1. Introduction

Members  of  organizations  communicate  with  each  other  in  order  to  coordinate  their  actions  in  pursuit 
of  a  common  goal.  Bilateral  information  exchanges  can  be  broadly  classified  as  monologs  or  dialogs.  A 
typical  monolog  would  be  a  published  forecast  of  market  conditions  in  response  to  which  a  manager 
chooses  an  inventory  level.  A  typical  dialog  could  be  a  question-and-answer  session;  e.g.  the  chief 
strategist  describes  a  new  product,  the  marketing  prognosticator  announces  a  sales  estimate  based  on  that 
description,  and  then  the  strategist  decides  whether  to  introduce  the  product. 

Our  common  sense  and  experience  tell  us  that  dialog,  if  not  in  some  sense  dominant,  is  certainly  a 
ubiquitous  mode  of  communication  in  the  real  world.  It  is  provocative,  then,  to  compare  this  observed 
dialog-rich  communication  with  the  type  of  discourse  manifested  in  traditional  economic  analysis.  In 
most  models  with  asymmetric  information  an  agent  is  called  upon  to  report  her  private  information  to  a 
principal.  Consideration  is  usually  restricted  to  mechanisms  in  which  the  agent  fully  and  truthfully 
reveals the entirety of her private information (i.e. she declares her "type").¹ In other words these are
models  of  a  monolog  world.  (There  is  no  need  to  ask  a  question  of  someone  who  is  planning  to  tell  you 
everything  she  knows,  because  you  know  that  you  will  receive  the  answer  to  every  question  which  she 
could  possibly  answer.) 

Why  do  we  observe  dialog  in  the  world  but  not  in  our  models?  Theorists  invoke  the  Revelation 
Principle  in  order  to  claim  that  no  generality  is  lost  when  they  focus  upon  tell-everything  monolog 
mechanisms.  This  exploitation  of  the  Revelation  Principle  is  only  valid,  however,  when  communication 
is costless.² When communication is costly, full revelation of type may be so expensive as to be
suboptimal, and efficiency in communication becomes a relevant criterion for the mechanism designer.
This raises new questions of who should say what to whom and when and thereby introduces the
possibility of observing dialog.³

The cost of communication is a theoretical lacuna which begs to be closed. As Tirole [1988: 49] has
pointed  out:  "Neoclassical  theory  pays  only  lip  service  to  the  issue  of  communication.  Information  flows 
between  members  of  an  organization  are  limited  only  because  of  incentive  compatibility  . . .  However, 
even  well-intentioned  members  of  an  organization  ...  may  have  trouble  communicating  all  the 
information  they  possess  to  their  relevant  co-members,  because  it  is  too  time  consuming  or  because  the 
information  is  hard  to  'codify'  to  make  it  understandable  to  its  receivers.  Thus,  decisions  that  would  be 
profit  maximizing  under  full  communication  will  not  be  made  under  imperfect  communication."  Arrow 
[1974: 5] advocates that before we can rigorously evaluate the market mechanism's claim to superior
informational economy we must "add to our usual economic calculations an appropriate measure of the
costs of information gathering and transmission."

¹ In some circumstances efficiency can be achieved without full revelation. See Groves and Ledyard [1977], Drèze and de la Vallée
Poussin [1971], and Malinvaud [1971].

² McAfee and McMillan [1988] extend the Revelation Principle to a case in which the principal bears a communication cost but the
agents do not.

³ Even more generally, costly communication has the potential to explain the existence of hierarchical organizational structures. (One
interpretation of the Revelation Principle holds that the outcome of a decentralized organization can be replicated by a centralized
two-tier structure; therefore hierarchy can never be strictly preferred.) See Melumad, Mookherjee, and Reichelstein [1989] for work in this
direction which adopts a message space dimensionality perspective on communication cost.

Communication  costs  lead  to  bounded  rationality  because  they  impose  cognitive  limitations — via 
knowledge  restrictions — upon  decisionmakers.  Limited  communication  is  similar  formally  to  other 
sources  of  bounded  rationality,  for  example  limited  memory.  Dow  [1991]  studies  an  agent  searching  for 
a  low  price  and  asks  how  the  agent  can  make  optimal  use  of  her  limited  memory  for  observed  prices. 
Clearly this is equivalent to a team⁴ communication problem with two partners, Past and Present, where
the design decision is how Past should use limited communication resources to inform Present about the
price Past observed.⁵

When  communication  is  free  and  truthful  revelation  is  assured,  all  relevant  information  can  be 
transmitted  to  a  specified  decisionmaker.  However,  when  communication  costs  inhibit  a  comprehensive 
information  exchange,  the  initial  informational  asymmetries  are  not  completely  obliterated;  by  the  end  of 
the  conversation  some  agents  still  know  some  things  which  others  do  not  know.  Some  agent  may  be 
more  qualified  to  make  the  choice-of-action  decision,  and  therefore  the  designer  must  determine  to 
whom  the  decision  would  be  optimally  delegated  and  under  what  circumstances. 

In  this  paper  we  posit  the  existence  of  communication  costs  in  a  bilateral  team  context.  Our  task  is  to 
determine  the  exact  process  by  which  private  information  should  be  shared  and  decisionmaking 
authority  should  be  delegated.  We  will  see  that  indeed  there  exist  cases  in  which  it  would  be  suboptimal 
for  one  party  to  engage  in  a  monolog  with  the  other;  i.e.  a  dialog  would  achieve  a  higher  performance 
standard  for  a  fixed  communication  cost. 

We  are  intrigued  by  three  features  of  the  optimal  information  sharing  mechanisms  we  have 
encountered during our research. First, they can have an unpredictable life of their own. To guarantee
efficiency  it  is  not  sufficient  that  we  merely  enrich  the  designer's  toolkit  to  include  a  simplistic 
capability  of  alternating  question  and  answer.  A  rigid,  fixed-sequence  dialog — which  dictates  that  the 
microphone  change  hands  in  a  particular,  predetermined  order  and  prescribes  ahead  of  time  a  particular 
agent  as  the  ultimate  decisionmaker — can  be  suboptimal  just  as  a  rigid  monolog  can  be.  Instead,  the 
designer  must  supply  a  communication  algorithm,  according  to  which  the  identity  of  the  speaker  at  any 
stage  and  the  identity  of  the  ultimate  decisionmaker  are  determined  endogenously  by  the  particular 
realization of the agents' private data. Just as a software engineer cannot predict precisely what output
will  result  from  a  computer  program  without  first  knowing  its  input  data,  the  mechanism  designer 
cannot — in  ignorance  of  the  agents'  private  information — predict  the  sequence  of  speakers  in  the 
conversation  or  the  identity  of  the  decisionmaker. 

Secondly, it can be strictly better if occasionally neither party speaks, letting one of them choose an
action in complete ignorance of the other's private information, even when they have already paid for
the right to communicate and when the choice of action could be improved by more precise knowledge
of that private information.⁶ The intuition for this counterintuitive result is that a deliberate silence in
some circumstances increases the informativeness of communication in the remaining circumstances. A
loss in decisionmaking quality due to remaining silent in a circumstance in which communication would
be relatively ineffective anyway can be traded off against a gain generated by the resulting increased
informativeness in cases where communication is more crucial.⁷

⁴ A team in the sense of Marschak and Radner [1972] is a group whose members have only common interests.

⁵ This communication is clearly limited to a monolog in which Past speaks to Present. The reverse direction would be a case of
paranormal fortunetelling rather than bounded rationality!

A  third  interesting  feature  of  these  optimal  communication  algorithms  concerns  who  dominates  the 
conversation.  We  will  see  that  the  identity  of  the  agent  who  should  do  the  most  talking  can  depend  on 
the  size  of  the  communication  budget.  If  you  cannot  pay  for  much  talking,  perhaps  one  agent  is  the 
preferred lecturer. If you have more to spend, it can be optimal to hear instead from the other.

Although  we  show  that  dialog  is  sometimes  necessary  for  efficiency,  we  do  establish  sufficient 
conditions  under  which  monolog  is  efficient.  Mathematically  this  takes  the  form  of  an  additive 
separability requirement. An interpretation of this result is that monolog will be sufficient when the error
in the final result (generated by a given misestimate by the decisionmaker of her partner's private
information) is independent of the decisionmaker's private information. For monolog efficiency, then,
it  is  sufficient  that  the  decision  problem  have  a  structure  in  which  the  partners'  private  data  do  not 
interact  in  this  well-defined  sense. 
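
As a rough numerical illustration of this separability condition (not taken from the paper), the sketch below compares an additively separable function with a multiplicatively separable one. The functions `phi_add` and `phi_mul`, the helper `misestimate_error`, and all sample values are assumptions chosen purely for illustration.

```python
def phi_add(x, y):
    """Hypothetical additively separable optimal-action function."""
    return x**2 + 3.0 * y

def phi_mul(x, y):
    """Hypothetical multiplicatively separable optimal-action function."""
    return x * y

def misestimate_error(phi, x_true, x_guess, y):
    """Error in the action when the decisionmaker (who knows y exactly)
    acts on x_guess although her partner's datum is really x_true."""
    return abs(phi(x_guess, y) - phi(x_true, y))

ys = [0.0, 0.25, 0.5, 1.0]
add_errors = [misestimate_error(phi_add, 0.4, 0.5, y) for y in ys]
mul_errors = [misestimate_error(phi_mul, 0.4, 0.5, y) for y in ys]
print(add_errors)  # equal for every y, up to floating-point rounding
print(mul_errors)  # grows with y
```

In the additive case the cost of a given misestimate of x does not depend on the decisionmaker's own datum y; in the multiplicative case it does, which is the kind of interaction the condition rules out.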


The  problem 


In  order  to  focus  on  the  analytical  issues  raised  by  costly  communication  we  will  abstract  away  from 
incentive issues and study the following problem:⁸ Consider a team of two privately informed members.
The  designer's  task  is  to  construct  a  mechanism  through  which  the  team  members  share  their  private 
information  with  one  another  in  order  that  one  of  them  will  make  a  decision  which  would  optimally 
depend  on  both  of  their  privately  known  data.  (The  designer  does  not  know  at  the  time  of  design  what 
each  agent's  private  datum  will  be  at  the  time  the  mechanism  is  implemented.) 

Our communication cost measure is communication length and is worst-case: we pay for the number
of binary digits (bits) in the longest possible exchange (i.e. over all possible private data realizations).⁹
Thus we are emphasizing that communication takes time and that this is a principal part of its cost.¹⁰ A
mechanism is efficient if it results in a worst-case decision error no greater than that of any other
mechanism which pays for the same number of bits.¹¹ For a given n we search for an efficient n-bit
algorithm in order to establish the minimum worst-case decision error associated with n bits. By varying
n we then can construct the efficient error-communication length frontier.

⁶ Such an algorithm bears a formal resemblance to "management by exception" in which more communication occurs when private
information takes on "exceptional" values. (Marschak and Radner [1972: 206-207])

⁷ This phenomenon can also occur in Dow's [1991] limited-memory search model. No communication from Past to Present is equivalent
to buying the good from the first firm visited without bothering to visit the second firm to compare prices. Dow prohibits buying
without observing both firms' prices. Kofman and Ratliff [1991b] show that the shopper would be strictly better off if she could waive
the right to comparison shop. By purchasing without comparing prices when the first firm's price is very low (which is almost certainly
the correct decision and is not a very costly mistake even when it is incorrect), she husbands her memory to more finely discriminate
between higher prices for the first firm. This improves her decision about which firm to buy from in the situations in which the wrong
decision would be more costly.

⁸ We thank Tom Marschak for posing to us a discrete formulation of this problem from which the present paper evolved.

⁹ A similar problem has been studied in the computer science literature concerning distributed and parallel processing. It takes the
discrete form of approximating a Boolean-valued function defined on a grid. See Yao [79], Abelson [80], and Karp and Ng
[forthcoming].

¹⁰ This information theoretic approach to measuring communication is used by Oniki [1986], who compares the informational efficiency
of an auctioneer-mediated market with that of a centralized system for a particular single-good production economy. This contrasts with
another strand in the literature in which the communication constraint is the dimensionality of the message space (i.e. the number of
message "pipelines") with no constraint on how much a pipeline can be used. For example see Hurwicz [1960], Mount and Reiter
[1974], Walker [1977], Osana [1978], Jordan [1982], and Green and Laffont [1987].

¹¹ Minimizing the worst-case error is a reasonable optimality criterion in circumstances of "complete ignorance" (Arrow and Hurwicz
[1977]) or in an infinite risk aversion limit. Its tractability aids in breaking new ground, and results achieved here generate conjectures
to guide research under other criteria.

We now present two scenarios as exemplar problems which can be analyzed using the tools we
develop in this paper.

Example 1: Team-spying the explosives factory

Two spies, Boris and Natasha, will be placed behind enemy lines to determine the amount of
explosives being produced by a factory. The explosives are manufactured from two inputs X and Y
according to a well-known production function. These inputs are produced in separate regions and hence
arrive at the factory along different roads. The maximum capacity of each road to handle truck traffic is
well known. From his vantage point Boris can observe with complete precision the flow of X into the
factory but cannot observe at all the amount of Y; similarly Natasha can observe the flow of Y but not
that of X. (See Figure 1.)

[Figure 1: The map for the mission. Labels recoverable from the figure: "Delivery route for input X"; "Explosives factory"; "Delivery route for input Y".]

Each spy is equipped with a short-range radio which can send on/off pulses to the other. The enemy is
not yet aware that the factory is being surveilled, and the more numerous are the inter-spy pulses, the
greater is the risk that the espionage mission will be discovered. Consequently the spies want to
minimize their communication with each other subject to a constraint on the accuracy of their production
estimate. The radios' range is too short to reach their home territory, so one of the spies must physically
return home (by navigating the only available single-person kayak down a wild river) in order to
deliver an intelligence assessment to their commanders.

Prior  to  being  deployed  in  hostile  territory  Boris  and  Natasha  meet  at  a  local  bistro  to  have  a  beer  and 
to  design  a  communication  algorithm.  Upon  what  rules  of  communication  and  decision  delegation 
should  they  coordinate?  For  some  fixed  number  n  of  pulses  who  should  send  the  first  pulse?  Should  it  be 
followed  by  a  second  pulse  from  the  same  spy  or  perhaps  by  a  premier  pulse  from  the  other?  Who  will 
be the whitewater courier delivering the final guesstimate of explosives production? How will this
sequence  of  pulses  and  the  identity  of  the  kayaker  depend  on  the  spies'  private  information  (i.e.  upon  the 
quantities  X  and  Y  they  each  observe)? 

Once  they  have  decided  upon  a  communication  algorithm  for  every  fixed  number  n  of  pulses  and 
determined  its  worst-case  error,  there  is  still  the  question  of  what  number  n  to  pick.  More  pulses  would 
result  in  a  greater  chance  that  their  cover  will  be  blown  but  would  also  increase  the  accuracy  of  their 
intelligence  report.  What  is  the  tradeoff?  I.e.  how  much  would  accuracy  be  improved  by  an  increase  in 
the  inter-spy  communication? 
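
To make the tradeoff question concrete, here is a minimal sketch under strong assumptions that are not in the paper: the production function is taken to be `x * y` on the unit square, only one spy (the observer of X) sends pulses, and that spy simply names one of 2**n equal-width intervals. Nothing here claims that this uniform monolog is efficient; the names `uniform_monolog_error` and `production` are hypothetical.

```python
def uniform_monolog_error(phi, n, grid=101):
    """Worst-case error when X names one of 2**n equal-width cells of [0, 1]
    and Y then takes the midpoint of the image phi(cell x {y}).
    Assumes phi is monotone in x, so each image is spanned by the cell endpoints."""
    cells = 2 ** n
    worst = 0.0
    for k in range(cells):
        lo, hi = k / cells, (k + 1) / cells
        for j in range(grid):
            y = j / (grid - 1)
            # Midpoint action leaves an error of half the spread of phi over the cell.
            half_oscillation = abs(phi(hi, y) - phi(lo, y)) / 2.0
            worst = max(worst, half_oscillation)
    return worst

def production(x, y):
    """Hypothetical production function; the text does not specify one here."""
    return x * y

for n in range(5):
    print(n, uniform_monolog_error(production, n))
```

Under these assumptions each additional pulse halves the worst-case error (0.5 with no pulses, 0.25 with one, 0.125 with two, and so on), which gives one simple benchmark against which a cleverer algorithm could be compared.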

Example 2: Production decisions in an informationally decentralized firm

Consider  a  ski  equipment  manufacturer  where  the  personnel  director  must  be  told  how  many  workers 
w  to  hire  for  the  upcoming  production  season.  The  relevant  information  for  the  optimal  decision  is 
distributed throughout the organization: The production manager will learn precisely the
productivity-per-worker $\alpha \in [0, \bar{\alpha}]$. The marketing chief will learn the precise value of a demand parameter $\beta \in [0, \bar{\beta}]$.
(See Figure 2.)

[Figure 2: An informationally decentralized firm. Labels recoverable from the figure: "The Boundary of Economic Expertise"; "Director, Personnel Department"; "Hires L* workers".]

When communication is costly, the design problem is to construct an efficient procedure by which the
production manager and marketing chief will share their private information in order to estimate w*.¹²
After this exchange the appropriate party can inform the personnel director of the optimal employment
level w*.¹³ The goal is to minimize the worst-case departure from optimal employment over all possible
productivity and demand combinations.

The  same  design  questions  arise  here  as  in  the  spy  example.  For  a  fixed  communication  cost,  who 
should  speak  first,  the  production  manager  or  the  marketing  chief?  Should  they  discuss  the  matter  in  a 
dialog?  Or  should  the  production  manager,  say,  use  the  entire  communication  allotment  to  inform  her 
partner as precisely as possible about productivity-per-worker so that the marketing chief can then
calculate  an  approximation  to  the  optimal  employment  level  w*?  After  this  design  problem  has  been 
solved  for  an  arbitrary  fixed  communication  cost,  the  decision  remains  about  how  much  communication 
to  perform.  What  is  the  tradeoff  between  communication  cost  and  worst-case  employment  error?  At 
what  point  does  it  no  longer  pay  for  the  two  informed  parties  to  continue  to  talk  to  one  another? 

On  interpretation 

The  phenomena  with  which  we  are  concerned  are  communication,  coordination,  and  decision 
delegation.  Of  course  we  realize  that  people  do  not  communicate  by  exchanging  bits  in  order  to  locate 
each  other  one-dimensionally  on  an  interval,  that  the  problem  of  an  organization  is  not  to  announce  a 
value  which  minimizes  the  worst-case  error  of  approximation  to  a  function,  and  that  incentive 
compatibility  is  a  crucial  constraint.  We  do,  however,  see  our  model  as  a  usefully  interesting,  yet 
tractable,  metaphor  for  more  complex  situations. 

The  more  general  scenario  we  have  in  mind  is  two  people  seeking  to  arrive  at  a  consensus  action 
which  will  bring  them  satisfaction  if  the  action  is  appropriate  to  their  common  situation  and  some  pain  if 
it  is  not.  The  focus  of  our  research  is  the  process  by  which  two  individual  and  privately  known  situations 
become  a  more  nearly  commonly  known  situation.  That  is  to  say:  how  do  two  agents  come  to  share  an 
understanding  of  the  state  of  the  world? 

We believe that there is no way in which a human being can express to another the full meaning of
her  situation.  We  claim  that  a  person  has  no  method  of  letting  someone  else  know  everything  that  she 
perceives,  believes,  remembers,  guesses,  understands,  fears,  hopes,  and  doubts  about  both  the 
environment  and  herself  in  a  finite  amount  of  time.  However,  action  must  be  taken  after  only  a  finite 
conversation, so we must be concerned with efficient communication.

For example, the production manager does not only know the normal productivity of different inputs
for a given technology. She also knows the reliability and morale of the workers, the probability that the
machines will break down, the necessary maintenance schedule, the gossip about new technical
developments, etc. (Moreover, she has a lot of background beliefs of which she may not even be aware!)
We assert that she cannot transmit all this, so she must use her language providently when she
coordinates her actions with others.

¹² The personnel director rarely attended his economics classes and would not know how to compute the optimal employment level even if
he were perfectly informed about $\alpha$ and $\beta$; thus he is outside the boundary of economic expertise. Consequently we need not consider
algorithms in which the production manager and the marketing chief both communicate approximations of their private information
directly to the personnel director.

¹³ This communication with the personnel director is also costly. However, we assume that it is equally costly regardless of whether it is
done by the production manager or the marketing chief. Therefore it is a fixed cost in the design problem.

In  our  model  we  do  not  attempt  to  replicate  a  complicated,  multidimensional  "common  situation;"  we 
consider  points  in  a  rectangle.  On  the  other  hand,  we  need  not  endow  the  agents  with  a  rich  context- 
dependent  natural  language;  we  have  zeros  and  ones.  In  our  opinion  tractability  justifies  the  simplicity 
we  impose  when  we  project  what-to-communicate  and  how-to-communicate  onto  a  conceptually 
manageable  space.  We  can  then  address  issues  of  syntax  (the  rules  of  communication)  and  semantics 
(the meaning of the messages) within an elegant mathematical formalism.¹⁴


2. The Model


A team has two partners: X and Y. They have a joint address $\theta = (\theta_X, \theta_Y) = (x, y)$ on a rectangle

$$E = E_X \times E_Y, \tag{A.1}$$

where $E_X$ and $E_Y$ are the closed intervals

$$E_X = [a, b], \tag{A.2a}$$

$$E_Y = [c, d]. \tag{A.2b}$$

A real-valued optimal action function is defined and bounded on the rectangle $E$, viz. $\varphi: E \to \mathbb{R}$, and is
known to both players. Ideally the team would take the optimal action $\varphi(\theta)$ corresponding to their joint
address. However, each partner knows only a projection of $\theta$; X knows the precise value of $x$ and Y
knows the precise value of $y$. Initially X's only information about Y's location is that $y \in E_Y$. Similarly, Y
knows only that $x \in E_X$.

The partners may communicate by sending binary digits (bits) to one another. One of the players then
chooses an action $\psi \in \mathbb{R}$ which approximates the desired action $\varphi(\theta)$. The communication process and
the choice of action are dictated by a communication algorithm $R$, which is understood by both players.
The algorithm $R$ must specify the instigator (the partner who initiates the process by either immediately
selecting an action or by sending the first bit) as well as who sends what bits to whom and when and
who chooses the ultimate action and when.¹⁵ Because knowledge of $x$ and $y$ is privately held by X and Y,
respectively, partner $i$'s activities under the algorithm $R$ can depend only on her own coordinate $\theta_i$ and
on the history of bits sent and received; they cannot depend explicitly on her partner's coordinate $\theta_j$.
Without loss of generality we assume that the two partners never send simultaneous messages.


¹⁴ Our strategy of radical simplification seems to be vindicated by our work-in-progress using a multidimensional information structure for
each partner. We have found interesting multidimensional problems which decompose into separate one-dimensional problems and
which therefore directly require the techniques we develop in this paper.

¹⁵ Because of the information partition the identity of the instigator must be a property of the algorithm which is independent of $\theta$.
(Otherwise there would be cases in which both players tried to instigate or neither instigated.)

For a given $\theta$ the communication algorithm $R$ prescribes a particular exchange of some number $\eta(\theta)$
of bits and results in some action $\psi(\theta)$. The communication length, $n$, of $R$ is the number of bits sent in
the longest possible exchange, i.e.

$$n = \max_{\theta \in E} \eta(\theta). \tag{A.3}$$

Note that we are adopting a worst-case definition of the communication length of the algorithm $R$. The
error of the algorithm $R$ is also a worst-case formulation:

$$\varepsilon = \sup_{\theta \in E} |\psi(\theta) - \varphi(\theta)|. \tag{A.4}$$

Without loss of generality we assume that $\psi$ is bounded on the unit square to ensure that the error as a
function of $\theta$ is bounded and therefore that the error of $R$ is defined.
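
The two worst-case measures just defined can be approximated numerically. The sketch below is only an illustration: it replaces the maximum and supremum over the rectangle by a maximum over a finite grid, and the names `evaluate`, `one_bit_monolog`, and the choice `phi(x, y) = x + y` on the unit square are assumptions rather than anything specified in the text.

```python
from itertools import product

def evaluate(algorithm, phi, xs, ys):
    """Return (communication length, worst-case error) of `algorithm` over the
    sample points. `algorithm(x, y)` must return (bits_sent, action), where
    bits_sent is a string of '0'/'1' characters exchanged for that (x, y)."""
    length, error = 0, 0.0
    for x, y in product(xs, ys):
        bits, action = algorithm(x, y)
        length = max(length, len(bits))              # longest exchange seen so far
        error = max(error, abs(action - phi(x, y)))  # largest departure seen so far
    return length, error

def phi(x, y):
    """Hypothetical optimal-action function on the unit square."""
    return x + y

def one_bit_monolog(x, y):
    """X tells Y which half of [0, 1] contains x; Y, who knows y, then acts as if
    x were the midpoint of that half. Only the bit travels from X to Y."""
    bit = '1' if x >= 0.5 else '0'
    x_mid = 0.75 if bit == '1' else 0.25
    return bit, phi(x_mid, y)

grid = [i / 100 for i in range(101)]
print(evaluate(one_bit_monolog, phi, grid, grid))
```

Because the grid is finite, the reported error is a lower bound on the true supremum; for this toy algorithm the two coincide at about 0.25 with a communication length of 1.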

The designer's problem is to specify, in ignorance of which $\theta$ will be realized, communication
algorithms which are efficient with respect to communication length and error of approximation. An
algorithm $R$ is efficient if there exists no other algorithm which achieves either a strictly lower error with
a weakly lower communication length or a strictly lower communication length with a weakly lower
error. We wish to characterize for a given function $\varphi$ the efficient frontier of achievable
(error, communication length) pairs, i.e. those which result from efficient algorithms. We need not find
all of the efficient algorithms; we need locate only a subset sufficient to demarcate the frontier.
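
The efficiency definition above is a dominance test on (error, communication length) pairs, which can be sketched in a few lines. The function name `efficient_frontier` and the candidate pairs are hypothetical; the pairs are invented numbers, not results from the paper.

```python
def efficient_frontier(pairs):
    """Keep the (error, length) pairs that no other pair dominates, i.e. no other
    pair is weakly better in both coordinates and strictly better in one."""
    def dominates(p, q):
        return p[0] <= q[0] and p[1] <= q[1] and p != q
    return sorted({q for q in pairs if not any(dominates(p, q) for p in pairs)},
                  key=lambda q: q[1])

# Hypothetical (worst-case error, communication length) pairs for five algorithms.
candidates = [(0.50, 0), (0.25, 1), (0.30, 1), (0.25, 2), (0.10, 2)]
print(efficient_frontier(candidates))  # [(0.5, 0), (0.25, 1), (0.1, 2)]
```

Here (0.30, 1) is dropped because (0.25, 1) has a strictly lower error at the same length, and (0.25, 2) is dropped because (0.25, 1) achieves the same error with fewer bits.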

For a given joint location $\theta$, the algorithm $R$ will ultimately call upon one of the partners to be the
actor for $\theta$. We characterize the actor's knowledge about $\theta$ at that time by specifying the actor's
information set: the minimal subset $S \subset E$ within which she knows $\theta$ to lie. Given her knowledge that
$\theta \in S$, the actor knows that the optimal action $\varphi(\theta)$ belongs to the image of $S$ under $\varphi$:

$$\varphi(S) = \{\varphi(\theta) : \theta \in S\}. \tag{A.5}$$

The actor's best (i.e. worst-case error minimizing) approximation action given $S$ is the midpoint of $\varphi(S)$,

$$\psi(S) = \tfrac{1}{2}\bigl(\inf \varphi(S) + \sup \varphi(S)\bigr), \tag{A.6}$$

which results in the worst-case error given $S$,

$$\varepsilon(S) = \tfrac{1}{2}\,\Delta\varphi(S), \tag{A.7}$$

where we denote the function's oscillation over $S$ by

$$\Delta\varphi(S) = \sup \varphi(S) - \inf \varphi(S). \tag{A.8}$$

We observe that adding points to $S$ would weakly increase the supremum and weakly decrease the
infimum and thereby weakly increase the oscillation of $\varphi$ over $S$; therefore the error over a strictly larger
set is weakly greater than the error over the subset.
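
The midpoint rule and the monotonicity observation can be checked on a small example. This is a sketch only: the information set is represented by a finite sample of points, so the infimum and supremum become a minimum and maximum, and `phi`, `image_bounds`, and `best_action_and_error` are hypothetical names.

```python
def image_bounds(phi, points):
    """(min, max) of phi over a finite sample of the information set S."""
    values = [phi(x, y) for x, y in points]
    return min(values), max(values)

def best_action_and_error(phi, points):
    """Midpoint action as in (A.6) and the resulting worst-case error,
    which is half the oscillation of phi over the sampled set."""
    lo, hi = image_bounds(phi, points)
    return (lo + hi) / 2.0, (hi - lo) / 2.0

def phi(x, y):
    """Hypothetical optimal-action function."""
    return x * y

small = [(x / 10, 1.0) for x in range(0, 6)]            # x in [0, 0.5], y = 1
large = small + [(x / 10, 1.0) for x in range(6, 11)]   # x in [0, 1],   y = 1
print(best_action_and_error(phi, small))  # (0.25, 0.25)
print(best_action_and_error(phi, large))  # (0.5, 0.5)
```

Enlarging the sampled set from `small` to `large` raises the error from 0.25 to 0.5, consistent with the observation that the error over a larger set is weakly greater.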

In section 3 we discuss monolog algorithms, namely those which dictate that all communication is
performed by the instigator. We restrict attention to optimal action functions which satisfy appropriate
monotonicity  and  continuity  assumptions.  For  a  given  communication  length  we  compute  in  "almost 
closed  form"  the  error  and  structure  of  algorithms  which  are  efficient  within  the  class  of  monologs.  We 
begin  section  4  by  presenting  two  examples  which  demonstrate  that  monolog  need  not  be  sufficient  for 
efficiency — a  dialog,  in  which  both  partners  speak,  can  outperform  a  best  monolog.  We  then  establish  a 
sufficient  condition  for  monolog  efficiency:  additive  separability  of  the  optimal  action  function.  We 
show  that  this  condition  is  not  necessary  by  exhibiting  a  rectangle  on  which  a  multiplicatively  separable 
optimal  action  function  can  be  efficiently  approximated  through  the  use  of  a  monolog.  In  section  5  we 
show  that  all  of  our  results  are  valid  even  when  the  monotonicity  requirements  are  dropped  as  long  as 
the  optimal  action  function  is  appropriately  separable.  Section  6  summarizes  the  paper  and  gathers  our 
final  thoughts. 

3.  Monolog  algorithms 

We  will  find  it  useful  to  study  a  simple  subset  of  the  set  of  communication  algorithms:  the  set  of 
monolog communication algorithms. Such an algorithm stipulates that for all $\theta$ any transmission of bits
is performed only by the instigator. If partner $i$, say, is the instigator, she either immediately chooses an
action $\psi(\theta_i)$ or chooses from among $2^n$ messages. (In a monolog any bits sent are transmitted as an
uninterrupted string.) If the instigator chooses message $k$, then her partner, $j$, chooses an action $\psi(\theta_j, k)$
which depends on his coordinate and the message he received.
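
The instigator's two options (act at once, or send one of 2**n messages as an uninterrupted bit string) can be sketched as follows. Everything specific here is an assumption for illustration: the interval is taken to be [0, 1], the message cells are equal-width, the immediate-action region is a single interval, n is at least 1, and the names `x_instigator_step` and `y_decode` are hypothetical.

```python
def x_instigator_step(x, immediate_cell, n):
    """X's move in an X-instigator monolog on [0, 1].
    Returns ('act', None) if x lies in the (hypothetical) immediate-action
    interval, otherwise ('send', bits) naming one of 2**n equal-width cells."""
    lo, hi = immediate_cell
    if lo <= x <= hi:
        return 'act', None
    k = min(int(x * 2 ** n), 2 ** n - 1)      # index of the cell containing x
    return 'send', format(k, '0{}b'.format(n))

def y_decode(bits):
    """Y recovers the interval of x values named by the received string
    (ignoring, for simplicity, any overlap with the immediate-action interval)."""
    n, k = len(bits), int(bits, 2)
    return k / 2 ** n, (k + 1) / 2 ** n

print(x_instigator_step(0.05, (0.0, 0.1), 2))  # ('act', None)
move, bits = x_instigator_step(0.8, (0.0, 0.1), 2)
print(move, bits, y_decode(bits))              # send 11 (0.75, 1.0)
```

With n bits the instigator can thus distinguish 2**n + 1 situations: the 2**n messages plus the option of staying silent and acting herself.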

For definiteness we consider here monologs in which X is the instigator. A parallel analysis applies to
Y-instigator monolog algorithms. In order to find an algorithm which is efficient within the class of
monologs we would find a monolog which is efficient within the class of X-instigator monologs and a
monolog which is efficient within the class of Y-instigator monologs. Whichever of the two has the lower error
would be efficient within the broader class of monolog algorithms. For the remainder of our discussion
of X-instigator monologs we will use the term "efficient" as a shorthand to mean "efficient within the
class of X-instigator monolog algorithms."

We first describe a monolog as a partition of $E_X$ and discuss the role of immediate action. We give
conditions under which we can restrict attention to partitions whose cells are convex. We discuss the
efficiency and existence of partitions which, loosely speaking, yield a common error in every cell. We
then solve the efficient monolog problem (computing both the error and the algorithm itself) for a
large  class  of  optimal  action  functions  and  discuss  the  comparative  statics  of  increasing  the 
communication  budget. 


Monolog  algorithms  as  partitions 


We can fully define a monolog by a $(2^n + 1)$-cell partition of $E_X$, which specifies the circumstances under which the instigator should immediately act or send a particular message, and by choice-of-action rules which specify how each partner would choose an action if called upon to do so. We define the number of messages $m$ and the index sets $M$ and $M^0$ by

$$m = 2^n, \tag{B.1}$$
$$M = \{1, \ldots, m\}, \tag{B.2}$$
$$M^0 = \{0\} \cup M. \tag{B.3}$$

The $(m + 1)$-cell partition of $E_X$ is denoted

$$\sigma = \{\sigma_0, \sigma_1, \ldots, \sigma_m\}, \tag{B.4}$$

i.e. where each $\sigma_i \subset E_X$, $\bigcup_{i \in M^0} \sigma_i = E_X$, and $\sigma_i \cap \sigma_j = \emptyset$ when $i \neq j$.

The cell $\sigma_0$ is the immediate action cell. When $x \in \sigma_0$, X is called upon to immediately take an action. She knows only that $\theta \in \{x\} \times E_Y$ and therefore that the optimal action lies in the image $\varphi(\{x\} \times E_Y)$. She takes the action $\psi(\{x\} \times E_Y)$ specified in (A.6), which results in the error $\varepsilon(\{x\} \times E_Y)$ from (A.7). (See Figure 3.) The error $\varepsilon_0$ for the immediate action cell is

$$\varepsilon_0 = \begin{cases} \sup_{x \in \sigma_0} \varepsilon(\{x\} \times E_Y), & \sigma_0 \neq \emptyset, \\ 0, & \sigma_0 = \emptyset. \end{cases} \tag{B.5}$$

Adding points to $\sigma_0$ could only increase the supremum in (B.5); therefore the error of immediate action would be weakly greater for a strictly larger immediate action cell.


Figure 3: (a) The set of $\theta$, namely $\{x\} \times E_Y$; (b) its image $\varphi(\{x\} \times E_Y)$ under $\varphi$; and (c) the best approximation $\psi(\{x\} \times E_Y)$ and the worst-case error $\varepsilon(\{x\} \times E_Y)$ when X immediately announces at $x$.

The remaining $m$ cells are the message cells. When $x \in \sigma_i$, $i \in M$, X sends the message $i$ in order to inform Y that $x \in \sigma_i$. Y then knows that $\theta \in \sigma_i \times \{y\}$ and therefore takes the action $\psi(\sigma_i \times \{y\})$, which results in the error $\varepsilon(\sigma_i \times \{y\})$. (See Figure 4.) The worst-case error $\varepsilon_i$ when X sends message $i$ is

$$\varepsilon_i = \sup_{y \in E_Y} \varepsilon(\sigma_i \times \{y\}). \tag{B.6}$$

If we were to add points to $\sigma_i$, then for each $y$ the error $\varepsilon(\sigma_i \times \{y\})$ would weakly increase, causing the supremum of these errors to weakly increase as well; therefore the error of a strictly larger message cell is weakly greater than the error of a smaller message cell. If there exists a $y^*$ at which the worst-case error is achieved, i.e. such that

$$y^* \in \arg\max_{y \in E_Y} \varepsilon(\sigma_i \times \{y\}), \tag{B.7}$$

we say that $y^*$ is a worst-case $y$ value for $\sigma_i$.


Figure 4: (a) The set of $\theta$, namely $\sigma_i \times \{y\}$; (b) its image $\varphi(\sigma_i \times \{y\})$ under $\varphi$; and (c) the best approximation $\psi(\sigma_i \times \{y\})$ and the worst-case error $\varepsilon(\sigma_i \times \{y\})$ when X sends message $i$ for a given $y$.

The error of the algorithm $\sigma$ is the largest individual cell error, i.e.

$$\varepsilon = \max\{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_m\}. \tag{B.8}$$

The designer's task is to choose the partition $\sigma$ of $E_X$ which minimizes $\varepsilon$.
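The cell errors (B.5) and (B.6) and the algorithm error can be approximated numerically on a grid. The following is a minimal sketch, not part of the original analysis. It assumes that the error of a set is half the oscillation of $\varphi$ over that set (which is how the error is used in the formulas of this section), that cells are intervals, and that $\varphi$ is continuous, so that closed grid intervals stand in for half-open cells. The names `monolog_error`, `half_oscillation`, and `linspace` are illustrative only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def linspace(lo: float, hi: float, n: int) -> List[float]:
    """Return n evenly spaced grid points covering [lo, hi]."""
    if n == 1:
        return [lo]
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def half_oscillation(values: Sequence[float]) -> float:
    """Assumed error of a set: half the spread of phi over that set."""
    return 0.5 * (max(values) - min(values))


def monolog_error(
    phi: Callable[[float, float], float],
    immediate: Optional[Interval],
    message_cells: List[Interval],
    e_y: Interval,
    grid: int = 101,
) -> float:
    """Grid approximation of the largest cell error of an X-instigator monolog."""
    ys = linspace(e_y[0], e_y[1], grid)
    errors = []
    if immediate is not None:
        # (B.5): X acts knowing only x, so the oscillation is taken over y,
        # and the cell error is the worst such value over x in the cell.
        xs = linspace(immediate[0], immediate[1], grid)
        errors.append(max(half_oscillation([phi(x, y) for y in ys]) for x in xs))
    for lo, hi in message_cells:
        # (B.6): Y acts knowing y and the cell, so the oscillation is taken
        # over x in the cell, and the cell error is the worst case over y.
        xs = linspace(lo, hi, grid)
        errors.append(max(half_oscillation([phi(x, y) for x in xs]) for y in ys))
    return max(errors)
```

For example, `monolog_error(lambda x, y: x * y, (0.0, 0.2), [(0.2, 0.4), (0.4, 0.6), (0.6, 0.8), (0.8, 1.0)], (0.0, 1.0))` evaluates the error of a partition of the unit interval with one immediate action cell and four message cells.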

Immediate  action  and  efficiency 

If the immediate action cell $\sigma_0$ of an algorithm partition $\sigma$ is empty, we say that $\sigma$ is a $1 \to m$ partition; if $\sigma_0 \neq \emptyset$, we say that $\sigma$ is a $0 \to m$ partition. When does efficiency require an empty immediate action cell? To answer this question we now turn to a significant difference between an immediate action cell and a message cell: If a message cell $\sigma_i$, $i \in M$, is a singleton, i.e. consists of a single point, its associated error $\varepsilon_i$ is zero because, after receiving message $i$, Y would know precisely not only his own coordinate $y$ but X's coordinate $x$ as well. Therefore if some message cell is empty, it would not increase the error of the algorithm if we were to transfer a single point to the empty message cell from another cell. Therefore efficiency can never require that some message cell be empty. On the other hand if the immediate action cell $\sigma_0$ is a singleton, say $\sigma_0 = \{x\}$, the associated error $\varepsilon(\{x\} \times E_Y)$ can be positive because X has uncertainty about Y's coordinate.

Consider a $1 \to m$ partition $\sigma$ whose error is $\varepsilon$. If for some $x \in E_X$ the error of immediate action is less than the error of the $1 \to m$ partition, i.e. if $\varepsilon(\{x\} \times E_Y) < \varepsilon$, then it would weakly decrease the algorithm's error if we removed $x$ from its message cell and added it to the previously empty immediate action cell. We would say the $1 \to m$ partition is improvident because it squanders its opportunity to weakly reduce its error through the incorporation of a nonempty immediate action cell. If on the other hand for all $x \in E_X$ the immediate action error exceeds the error of the $1 \to m$ partition, then it would strictly increase the algorithm's error to incorporate even a single point into its empty immediate action cell. In this case we say that the $1 \to m$ partition is provident; eschewing immediate action is efficient.

Convex  algorithm  partitions 

We now consider convex algorithm partitions: those of the form $\sigma = \{\sigma_0, \sigma_1, \ldots, \sigma_m\}$ such that each $\sigma_i$, $i \in M^0$, is a convex set (i.e. an interval). Convex partitions are especially tractable compared to more general partitions of $[a, b]$ because they can be almost completely specified by a set of $m$ numbers representing the cells, e.g. $\sigma_0 = [a, x_0]$, $\sigma_i = (x_{i-1}, x_i]$, $i \in M$, $x_m = b$. In order to take advantage of their structural simplicity we will in Theorem 1 impose monotonicity conditions upon the optimal action function $\varphi$ which guarantee that we can without loss of generality restrict attention to convex partitions; i.e. if presented with a partition which is not convex, we can exhibit a convex partition whose error is weakly less than that of the original partition. (These monotonicity assumptions, as well as additional assumptions imposed later, are not as restrictive as they might seem. We will see in Theorem 8 that all of our results can be easily extended to a much wider class of optimal action functions.)

Conditions under which convex partitions are sufficient


Theorem 1

For each $y \in E_Y$, let $\varphi(x, y)$ and $\varepsilon(\{x\} \times E_Y)$ be weakly monotonic functions of $x$ on $E_X$, and let $\sigma = \{\sigma_0, \sigma_1, \ldots, \sigma_m\}$ be a partition of $E_X$ whose error [from (B.8)] is $\varepsilon$. Then there exists a convex partition $\hat\sigma = \{\hat\sigma_0, \hat\sigma_1, \ldots, \hat\sigma_m\}$ of $E_X$, whose error we denote by $\hat\varepsilon$, such that (1) the error of $\hat\sigma$ is no greater, i.e. $\hat\varepsilon \le \varepsilon$; and (2) if the immediate action interval $\hat\sigma_0$ is nonempty, it will be at the left (respectively right) end of $E_X$ if $\varepsilon(\{x\} \times E_Y)$ is weakly increasing (respectively weakly decreasing), i.e. $\hat\sigma_0 \ni a$ (respectively $\hat\sigma_0 \ni b$).


Proof

For simplicity of exposition assume that $\varphi$ and $\varepsilon(\{\cdot\} \times E_Y)$ are continuous and that the monotonicity of $\varphi(\cdot, y)$ has the same sense for all $y \in E_Y$. For definiteness assume that both of the monotonicity assumptions are satisfied in the weakly increasing sense. First we demonstrate claim (2). If an algorithm specifies an immediate action for some $x_0$, the monotonicity of $\varepsilon(\{x\} \times E_Y)$ implies that the error of immediate action for any $x < x_0$ would be weakly lower; therefore the error of the algorithm would not be increased by incorporating within the immediate action cell all values in the interval $[a, x_0]$. Therefore we can replace $\sigma_0$ by $[a, \sup \sigma_0]$ without increasing the error of immediate action.

Now we show (1). Consider the $i$-th message cell. Thanks to monotonicity (and temporarily assumed continuity) we have


These  m  numbers  do  not  themselves  specify  the  particular  closedness  decisions  for  each  cell.  When  we  make  an  additional  continuity 
assumption,  the  significance  of  the  closedness  decisions  is  nil. 

More  general  and  more  detailed  versions  of  the  proofs  offered  here  appear  in  Kofman  and  Ratliff  [1991a]. 
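$$\sup \varphi(\sigma_i \times \{y\}) = \varphi(\sup \sigma_i, y), \qquad \inf \varphi(\sigma_i \times \{y\}) = \varphi(\inf \sigma_i, y).$$

Therefore for any $y \in E_Y$, the error when Y acts after receiving message $i$ is

$$\varepsilon(\sigma_i \times \{y\}) = \tfrac{1}{2}\bigl(\varphi(\sup \sigma_i, y) - \varphi(\inf \sigma_i, y)\bigr),$$

whose only dependence upon $\sigma_i$ is through its supremum and infimum, which would be unchanged if we convexified this set. Because $\varepsilon(\sigma_i \times \{y\})$ would be unchanged for every $y$ by convexification of $\sigma_i$, the error $\varepsilon_i$ from (B.6) would be unchanged as well. We construct $\hat\sigma$ by refining the covering $\{\mathrm{co}\,\sigma_0, \mathrm{co}\,\sigma_1, \ldots, \mathrm{co}\,\sigma_m\}$ in such a way that each $\hat\sigma_i$ is convex and $\hat\sigma_i \subset \mathrm{co}\,\sigma_i$. ∎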


The importance of monotonicity is exemplified in Figure 5 in which two optimal action functions $\varphi^{\mathrm{mono}}$ and $\varphi^{\mathrm{non}}$, horizontally monotonic and nonmonotonic, respectively, are plotted as functions of $x$ for an arbitrarily chosen $y$. The nonconvex subset $\sigma_i$ is the union of two disjoint intervals. In the lower half of the figure (the horizontally monotonic case) it is obvious that convexifying $\sigma_i$ does not increase the oscillation of $\varphi^{\mathrm{mono}}$ and therefore does not increase the error, i.e. $\Delta\varphi(\sigma_i \times \{y\}) = \Delta\varphi(\mathrm{co}\,\sigma_i \times \{y\})$ and therefore $\varepsilon(\sigma_i \times \{y\}) = \varepsilon(\mathrm{co}\,\sigma_i \times \{y\})$. In the upper half of Figure 5 the image of the nonmonotonic $\varphi^{\mathrm{non}}$ over $\sigma_i$ is a single point and hence the oscillation and error are zero. After convexification the image is the indicated segment of values, and therefore the oscillation and error are positive; i.e. the inclusion of new points through convexification results in an increased error over the new domain, i.e. $\varepsilon(\mathrm{co}\,\sigma_i \times \{y\}) > \varepsilon(\sigma_i \times \{y\})$.


Figure 5: The worst-case error for a horizontally monotonic function is unaffected by convexification; this need not be true in the absence of monotonicity.

Sufficient conditions for the monotonicity of the immediate action error

The requirement that $\varphi$ be weakly horizontally monotonic is easily interpreted; the requirement that $\varepsilon(\{x\} \times E_Y)$ be a weakly monotonic function of $x$ (which is indirectly a restriction upon $\varphi$) does not have as obvious an interpretation. The next theorem states sufficient conditions, applicable when $\varphi$ is appropriately differentiable, which guarantee the monotonicity of the immediate action error.


Theorem 2

Let $\varphi(\cdot, y)$ be either (a) a weakly increasing function for all $y \in E_Y$ or (b) a weakly decreasing function for all $y \in E_Y$. Further let $\varphi(x, \cdot)$ be either (a) weakly increasing for all $x \in E_X$ or (b) weakly decreasing for all $x \in E_X$. Let $\partial\varphi/\partial x$, $\partial\varphi/\partial y$, and $\partial^2\varphi/\partial x\,\partial y$ exist and be continuous over $E$. Then the following two statements are equivalent:

(1) $\partial^2\varphi/\partial x\,\partial y$ has weakly constant sign over $E$ (i.e. nonnegative over $E$ or nonpositive over $E$).

(2) The errors $\varepsilon(\{x\} \times \sigma_Y)$ and $\varepsilon(\sigma_X \times \{y\})$ are weakly monotonic functions of $x$ and $y$, respectively, for every choice of subsets $\sigma_X \subset E_X$ and $\sigma_Y \subset E_Y$.

Specifically, when the mixed partial is {nonnegative | nonpositive}, X's error $\varepsilon(\{x\} \times \sigma_Y)$ is weakly {increasing | decreasing} as $\varphi$ is vertically weakly {increasing | decreasing} and Y's error $\varepsilon(\sigma_X \times \{y\})$ is weakly {increasing | decreasing} as $\varphi$ is horizontally weakly {increasing | decreasing}.


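Proof

We will first show that (1) implies the first claim of (2) when $\varphi_{xy} \ge 0$ and $\varphi_y \ge 0$. The other cases are demonstrated similarly. Let $\inf \sigma_Y = c$ and $\sup \sigma_Y = d$. Because $\varphi_y \ge 0$,

$$\varepsilon(\{x\} \times \sigma_Y) = \tfrac{1}{2}\bigl(\varphi(x, d) - \varphi(x, c)\bigr).$$

This is weakly increasing in $x$ because $\varphi_x(x, d) \ge \varphi_x(x, c)$ for all $x \in E_X$ thanks to $\varphi_{xy} \ge 0$. (The continuity of $\varphi_{xy}$ is used to guarantee that $\varphi_{xy} = \varphi_{yx}$.)

We now show that (2) implies (1) for the case in which $\varphi_y \ge 0$ and in which $\varepsilon(\{x\} \times \sigma_Y)$ is weakly increasing. Because $\varepsilon(\{x\} \times \sigma_Y)$ is weakly increasing, we know that $\varphi_x(x, d) \ge \varphi_x(x, c)$ for all $x \in E_X$ and all $\sigma_Y \subset E_Y$. Therefore

$$\varphi_{xy}(x, y) = \lim_{h \to 0} \frac{\varphi_x(x, y + h) - \varphi_x(x, y)}{h} \ge 0$$

for all $x \in E_X$ and $y \in \mathrm{int}\, E_Y$. The result extends to all of $E$ due to the continuity of $\varphi_{xy}$. ∎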

Equal-error  convex  partitions 

For  the  remainder  of  our  analysis  of  monolog  algorithms  we  restrict  attention  to  optimal  action 
functions  (p  which  satisfy  the  hypotheses  of  Theorem  1.  Therefore  we  can  without  loss  of  generality 
confine  our  consideration  to  convex  partitions. 

We now direct our analysis toward those convex partitions which result, loosely speaking, in the same worst-case error in every cell. Theorem 3 establishes conditions under which we can be certain that any given "equal-error" convex partition is efficient. Theorem 4 establishes conditions under which we can be certain that such an equal-error convex partition exists.


The implications of these sufficient conditions are obviously stronger than we need to apply Theorem 1 to X-instigator algorithms: they are applicable when Y instigates as well. In addition, these conditions will be sufficient for the satisfaction of stronger hypotheses of later theorems.
We say that $\sigma$ is an equal-error $0 \to m$ partition if

$$\varepsilon_0 = \varepsilon_1 = \cdots = \varepsilon_m \quad \text{and} \quad \sigma_0 \neq \emptyset. \tag{B.9}$$

We say that $\sigma$ is an equal-error $1 \to m$ partition if

$$\varepsilon_1 = \cdots = \varepsilon_m \quad \text{and} \quad \sigma_0 = \emptyset. \tag{B.10}$$

We say that $\sigma$ is an equal-error partition if it is either an equal-error $0 \to m$ partition or an equal-error $1 \to m$ partition. We have previously defined the terms provident and improvident with respect to $1 \to m$ partitions. We now define any equal-error $0 \to m$ partition to be provident.

If a convex, provident, equal-error partition exists, it is efficient

Theorem 3 will tell us that if a convex, provident, equal-error partition exists then it must be efficient; i.e. no other partition with the same communication length could possibly have a lower error. As an example, consider a convex partition with two message cells, $\sigma = \{\sigma_L, \sigma_R\}$, where $\sigma_L = [a, x_1)$ and $\sigma_R = [x_1, b]$, which we assume to be equal-error with $\varepsilon = \varepsilon_L = \varepsilon_R$. Assume further that, for all $x \in E_X$, $\varepsilon(\{x\} \times E_Y)$ is sufficiently large that this partition is provident (i.e. efficiency does not require a nonempty immediate action cell). Now consider any other convex partition with two message cells, $\hat\sigma = \{\hat\sigma_L, \hat\sigma_R\}$, with error $\hat\varepsilon = \max\{\hat\varepsilon_L, \hat\varepsilon_R\}$. Let its intercell boundary be $\hat{x}_1$. This new partition cannot have a lower error. If it did we would have both $\hat\varepsilon_L < \varepsilon$ and $\hat\varepsilon_R < \varepsilon$. In order to make $\hat\varepsilon_L < \varepsilon$, we would have to choose $\hat{x}_1 < x_1$ in order to shrink $\hat\sigma_L$ relative to $\sigma_L$. However, thus shrinking $\hat\sigma_L$ makes $\hat\sigma_R$ a larger set than $\sigma_R$ and therefore $\hat\varepsilon_R \ge \varepsilon$.


Theorem 3

For each $y \in E_Y$, let $\varphi(x, y)$ and $\varepsilon(\{x\} \times E_Y)$ be weakly monotonic functions of $x$ on $E_X$. Let $\sigma$ be a provident, equal-error, convex partition. Then there does not exist a partition (whether $0 \to m$ or $1 \to m$) which has a lower error.


Sketch of Proof

The proof of Theorem 3, given in full in Kofman and Ratliff [1991a], relies upon a lemma which states that if $\sigma$ and $\hat\sigma$ are two distinct partitions with the same number of cells, then some cell $\hat\sigma_i$ of the second partition is a strictly larger set, and has a weakly larger error, than some cell $\sigma_j$ of the first partition. (If $\sigma$ is a provident $1 \to m$ partition, then any potentially lower-error $\hat\sigma$ would also have to be a $1 \to m$ partition because otherwise $\hat\varepsilon \ge \hat\varepsilon_0 \ge \inf_{x \in [a, b]} \varepsilon(\{x\} \times E_Y) \ge \varepsilon$.) Therefore the error of this different partition will be at least as large as the common error of the equal-error partition. ∎


Actually, a stronger claim is shown: that either the left-hand cell of $\hat\sigma$ is strictly larger than the left-hand cell of $\sigma$, the right-hand cell of $\hat\sigma$ is strictly larger than the right-hand cell of $\sigma$, or one of the interior cells of $\hat\sigma$ is strictly larger than one of the interior cells of $\sigma$. This assures us that we are not comparing errors based on a subset relationship between a message cell and an immediate action cell.
Continuity implies existence of an equal-error partition

By  requiring  continuity  of  the  optimal  action  function  we  are  guaranteed  that  an  equal-error  partition 
exists. 


Theorem 4

Let $\varphi$ be continuous and for every $y \in E_Y$, let $\varphi(x, y)$ and $\varepsilon(\{x\} \times E_Y)$ be strictly monotonic functions of $x$. Then for all positive integers $m$, there exists a unique equal-error $1 \to m$ partition. If this $1 \to m$ partition is improvident, then there also exists a unique equal-error $0 \to m$ partition. (Therefore there always exists a provident equal-error partition.) The error of this $0 \to m$ partition is weakly less than the error of the $1 \to m$ partition.


Sketch of Proof

This theorem probably seems intuitively plausible to the reader. The full proof, which is given in Kofman and Ratliff [1991a], has its conceptual substrate in the following iterative procedure, where for definiteness we assume that the $1 \to m$ partition is provident (the modification for the $0 \to m$ case is straightforward): Fix some sufficiently small $\varepsilon$ and find the endpoint $x_1$ of $\sigma_1 = [0, x_1)$ such that $\varepsilon_1 = \varepsilon$ [see (B.6)]. Now find the endpoint $x_2$ of $\sigma_2 = [x_1, x_2)$ such that $\varepsilon_2 = \varepsilon$. Continue in this fashion until either (1) $x_m = 1$ is determined, (2) $x_m < 1$ is determined, or (3) the right-hand boundary of $E_X$ is reached and there are cells $\sigma_i$, $i \le m$, remaining which have yet to receive their $\varepsilon$ error allotment. In case (1), the desired unique equal-error $1 \to m$ partition has been found. In cases (2) and (3), $\varepsilon$ must be increased or decreased, respectively, and then the process repeated. By choosing $\varepsilon$ sufficiently small or large, respectively, we can achieve cases (2) and (3). The continuity hypotheses guarantee that $x_m(\varepsilon)$ is a continuous function of $\varepsilon$ and therefore that a value for $\varepsilon$ can be found which will result in the desired case (1). ∎

Theorem  4  is  an  existence  result.  However,  it  is  easy  to  see  how  the  iterative  procedure  described  in 
the  above  sketch  of  the  proof  provides  an  algorithm  to  actually  compute  an  efficient  monolog  algorithm 
given  an  optimal  action  function  satisfying  the  assumptions  of  the  theorem. 

The theorem suggests the following strategy to find an efficient algorithm: First find an equal-error $1 \to m$ partition and determine its error $\varepsilon$. To determine whether this partition is provident find the minimum error of immediate action. If this error is weakly larger than $\varepsilon$, the $1 \to m$ partition is efficient. Otherwise, find an equal-error $0 \to m$ partition; it will be efficient.
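The first two steps of this strategy can be sketched numerically. The code below is an illustrative sketch under stated assumptions, not the procedure of Kofman and Ratliff [1991a]: it assumes $\varphi(\cdot, y)$ is monotonic for every $y$, so that a cell's error is half the spread of $\varphi$ between the cell's endpoints (maximized over a finite grid `ys` standing in for $E_Y$), and it replaces the "increase or decrease $\varepsilon$" step of the sketch of proof with bisection. The function names and the tolerance `tol` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def immediate_error(phi: Callable[[float, float], float], x: float,
                    ys: Sequence[float]) -> float:
    """Error when X acts immediately at x: half the spread of phi over the y grid."""
    vals = [phi(x, y) for y in ys]
    return 0.5 * (max(vals) - min(vals))


def equal_error_partition(phi: Callable[[float, float], float], a: float, b: float,
                          ys: Sequence[float], m: int,
                          tol: float = 1e-10) -> Tuple[float, List[float]]:
    """Approximate the equal-error 1 -> m partition of [a, b] by bisection on the error."""

    def cell_error(lo: float, hi: float) -> float:
        # Worst case over y of half the endpoint spread (uses monotonicity in x).
        return max(0.5 * abs(phi(hi, y) - phi(lo, y)) for y in ys)

    def extend(lo: float, eps: float) -> float:
        # Push the right endpoint as far as the error allotment eps allows.
        if cell_error(lo, b) <= eps:
            return b
        left, right = lo, b
        while right - left > tol:
            mid = 0.5 * (left + right)
            if cell_error(lo, mid) <= eps:
                left = mid
            else:
                right = mid
        return left

    def sweep(eps: float) -> List[float]:
        # Lay down m cells from left to right, each with error at most eps.
        bounds = [a]
        for _ in range(m):
            bounds.append(extend(bounds[-1], eps))
        return bounds

    lo_eps, hi_eps = 0.0, cell_error(a, b)
    while hi_eps - lo_eps > tol:
        eps = 0.5 * (lo_eps + hi_eps)
        if sweep(eps)[-1] >= b:
            hi_eps = eps   # the m cells cover [a, b]: the allotment can shrink
        else:
            lo_eps = eps   # the m cells fall short of b: the allotment must grow
    return hi_eps, sweep(hi_eps)
```

Providence can then be checked by comparing the returned error with the smallest immediate action error, which under the monotonicity hypothesis is attained at an endpoint: `min(immediate_error(phi, a, ys), immediate_error(phi, b, ys))`. If that minimum is below the returned error, the $1 \to m$ partition is improvident and a $0 \to m$ partition should be sought instead.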

Monolog  error  under  horizontal  additivity 

The task of finding an efficient monolog is greatly eased when the optimal action function satisfies an additional vertical monotonicity assumption. (Satisfaction of this assumption is implied by the constant-signed mixed-partial requirement of Theorem 2.) We can then determine the error over any message cell $\sigma_i$, $i \in M$, simply by evaluating the optimal action function $\varphi$ along a single, worst-case horizontal segment. This simplification yields "almost closed form" expressions for the error and the cell boundaries of an efficient partition.
Theorem 5

Let $\varphi$ be continuous and let $\varphi(\cdot, y)$ be either weakly increasing for all $y \in E_Y$ or weakly decreasing for all $y \in E_Y$. Further, let $\varphi$ be such that the error, given $y$, when X sends message $i$, viz. $\varepsilon(\sigma_i \times \{y\})$, is either (a) weakly increasing in $y$ for all $\sigma_i \subset E_X$ or (b) weakly decreasing in $y$ for all $\sigma_i \subset E_X$. Define the worst-case $y$-value $y^* = d$ in case (a) and $y^* = c$ in case (b).

(1) There exists an equal-error $1 \to m$ partition $\sigma$ of $[a, b]$ whose error is

$$\varepsilon = \frac{|\varphi(b, y^*) - \varphi(a, y^*)|}{2m}. \tag{5.1}$$

Define $\tilde\varphi(x) = \varphi(x, y^*)$. The cell boundaries of $\sigma$ are determined (uniquely if the horizontal monotonicity of $\tilde\varphi$ is strict) by

$$\sup \sigma_k \in \tilde\varphi^{-1}\bigl(\tilde\varphi(\min \sigma_k) \pm 2\varepsilon\bigr), \tag{5.2}$$

$k \in M$, and by $\min \sigma_1 = a$, where the sense of the "$\pm$" corresponds to $\varphi(\cdot, y^*)$ increasing and decreasing, respectively.

(2) Let $\varepsilon(\{x\} \times E_Y)$ be a weakly monotonic function of $x$. Define the best immediate announcement point $x_{\mathrm{best}} = a$ (respectively $x_{\mathrm{best}} = b$) if $\varepsilon(\{x\} \times E_Y)$ is weakly increasing (respectively weakly decreasing). Define the immediate action error function to be

$$\varepsilon_0(x_0) = \varepsilon(\{x_0\} \times E_Y). \tag{5.3}$$

If $\varepsilon_0(x_{\mathrm{best}}) \ge \varepsilon$, then $\sigma$ is efficient. Otherwise, there exists an equal-error $0 \to m$ partition $\hat\sigma$ of $[a, b]$. Define $x_m = b$ (respectively $x_m = a$) when $x_{\mathrm{best}} = a$ (respectively $x_{\mathrm{best}} = b$) and define the message cell error function to be

$$\hat\varepsilon(x_0) = \frac{|\varphi(x_m, y^*) - \varphi(x_0, y^*)|}{2m}. \tag{5.4}$$

There exists $\hat{x}_0 \in E_X$ such that $\varepsilon_0(\hat{x}_0) = \hat\varepsilon(\hat{x}_0)$, and the error of $\hat\sigma$ is $\hat\varepsilon = \varepsilon_0(\hat{x}_0) = \hat\varepsilon(\hat{x}_0) \le \varepsilon$. The immediate action interval is $\hat\sigma_0 = \mathrm{co}\,\{x_{\mathrm{best}}, \hat{x}_0\}$. The cell boundaries of an efficient partition are again determined by (5.2), where $\varepsilon$ is replaced by $\hat\varepsilon$, and by $\min \hat\sigma_1 = \hat{x}_0$ (respectively $\min \hat\sigma_1 = a$) when $\varepsilon(\{x\} \times E_Y)$ is weakly increasing (respectively weakly decreasing).


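Proof

From the definition of $y^*$ and the monotonicity of $\varphi(\cdot, y)$ and $\varepsilon(\sigma_i \times \{\cdot\})$, we see from (B.6) that the error over message cell $i$ is

$$\varepsilon_i = \sup_{y \in E_Y} \varepsilon(\sigma_i \times \{y\}) = \varepsilon(\sigma_i \times \{y^*\}) = \tfrac{1}{2}\,\bigl|\varphi(\sup \sigma_i, y^*) - \varphi(\inf \sigma_i, y^*)\bigr|.$$

The partition is equal-error, so the common message-cell error is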


We use $\tilde\varphi^{-1}$ to denote the inverse image of $\tilde\varphi$, viz. $\tilde\varphi^{-1}(z) = \{x \in E_X : \tilde\varphi(x) = z\}$.
$$\varepsilon = \frac{1}{m} \sum_{i \in M} \varepsilon_i = \frac{1}{2m} \sum_{i \in M} \bigl|\varphi(\sup \sigma_i, y^*) - \varphi(\inf \sigma_i, y^*)\bigr| = \frac{1}{2m}\,\bigl|\varphi(\sup \sigma_m, y^*) - \varphi(\inf \sigma_1, y^*)\bigr|,$$

where the message cells are labeled such that, for all $i, j \in M$, $i < j \Leftrightarrow \inf \sigma_i \le \inf \sigma_j$. This establishes (5.1) and (5.4). The error over the immediate action cell from (B.5) is given by (5.3). The equal-error requirement then determines $\hat{x}_0$. The boundaries defined by (5.2) are easily verified to result in equal-error cells. ∎


Example 3: A multiplicatively separable optimal action function

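As an example we will apply Theorem 5 to the case where $\varphi(x, y) = xy$ on $E = [a, b] \times [c, d] \subset \mathbb{R}^2_+$. Let $\Delta x = b - a$ and $\Delta y = d - c$. We observe from Theorem 2 (because $\partial^2\varphi/\partial x\,\partial y = 1$) that this function satisfies the hypotheses of Theorem 5, including that $\varepsilon(\sigma_i \times \{y\}) = \tfrac{1}{2}(\sup \sigma_i - \inf \sigma_i)\,y$ increases in $y$ (and therefore $y^* = d$) and that

$$\varepsilon(\{x\} \times E_Y) = \tfrac{1}{2}\,x\,\Delta y \tag{B.11}$$

increases in $x$. The error for an efficient $1 \to m$ partition is found from (5.1) to be

$$\varepsilon = \frac{|\varphi(b, y^*) - \varphi(a, y^*)|}{2m} = \frac{d\,\Delta x}{2m}. \tag{B.12}$$

To compute the cell boundaries we note that $\tilde\varphi^{-1}(z) = z/d$ and from (5.2) we have

$$\sup \sigma_k = \frac{1}{d}\left[ad + 2k\,\frac{d\,\Delta x}{2m}\right] = a + \frac{k\,\Delta x}{m}, \tag{B.13}$$

indicating that each cell has the constant width $\Delta x/m$.

It will be efficient to have a nonempty immediate action interval $\sigma_0 = [a, x_0)$ only if the smallest possible immediate action error is less than the error without such an interval, viz. $\varepsilon(\{a\} \times E_Y) = \tfrac{1}{2}\,a\,\Delta y < \varepsilon$; i.e. a $0 \to m$ partition is required when

$$\frac{d\,\Delta x}{m} > a\,\Delta y. \tag{B.14}$$

When (B.14) is satisfied, we find $\hat{x}_0$ by equating the immediate action error $\varepsilon_0(\hat{x}_0) = \tfrac{1}{2}\,\hat{x}_0\,\Delta y$ and the message cell error $\hat\varepsilon(\hat{x}_0) = d(b - \hat{x}_0)/2m$, yielding

$$\hat{x}_0 = \frac{bd}{m\,\Delta y + d}. \tag{B.15}$$

Substituting (B.15) into $\varepsilon_0(\hat{x}_0)$, we find that the error for the algorithm is

$$\hat\varepsilon = \frac{1}{2}\,\frac{bd\,\Delta y}{m\,\Delta y + d}. \tag{B.16}$$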
A calculation similar to that yielding (B.13) shows that the message cells have the constant width $(b - \hat{x}_0)/m = b\,\Delta y/(m\,\Delta y + d)$, which satisfies

$$\frac{b\,\Delta y}{m\,\Delta y + d} \le \frac{bd}{m\,\Delta y + d} = \hat{x}_0. \tag{B.17}$$

The message cell width is therefore weakly less than $\hat{x}_0$, which is the width of the immediate action cell when $a = 0$.

Summarizing both cases we find the error of any efficient partition for this problem to be $\hat\varepsilon$ (respectively $\varepsilon$) when (B.14) is (respectively is not) satisfied. We will see later in Theorem 8 that we can generalize this result to apply to any multiplicatively separable optimal action function, i.e. one of the form $\varphi(x, y) = f(x)\,g(y)$, providing neither $f(E_X)$ nor $g(E_Y)$ contain zero as an interior point.
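The closed forms of Example 3 are easy to evaluate exactly. The following sketch transcribes (B.12) through (B.16) for $\varphi(x, y) = xy$ using rational arithmetic; the function name `efficient_monolog_xy` and its return layout are illustrative choices rather than anything defined in the paper, and the code assumes $0 \le a < b$, $0 \le c < d$, as in the example.

```python
from fractions import Fraction


def efficient_monolog_xy(a, b, c, d, m):
    """Efficient X-instigator monolog for phi(x, y) = x*y on [a, b] x [c, d].

    Returns (error, x0, boundaries): x0 is the right end of the immediate
    action interval (None if that interval is empty), and boundaries lists
    the m + 1 endpoints of the m message cells.
    """
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    dx, dy = b - a, d - c
    if d * dx / m > a * dy:           # (B.14): immediate action pays off
        x0 = b * d / (m * dy + d)     # (B.15)
        eps = x0 * dy / 2             # (B.16)
        width = (b - x0) / m          # equal-width message cells on [x0, b]
        start = x0
    else:
        x0 = None
        eps = d * dx / (2 * m)        # (B.12)
        width = dx / m                # (B.13)
        start = a
    boundaries = [start + k * width for k in range(m + 1)]
    return eps, x0, boundaries
```

Decimal endpoints are best passed as strings (for example `"0.1"`) so that `Fraction` parses them exactly rather than inheriting binary floating-point rounding.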

Comparative statics on the communication budget

It is interesting to vary the number of bits $n$ at our disposal (our communication budget) to see the effect upon the structure of and choice of instigator for efficient monologs. First we note that the size of a nonempty immediate action interval for an efficient algorithm must shrink with an increase in $m$. To see this assume for definiteness that $\varepsilon(\{x\} \times E_Y)$ increases in $x$. If to the contrary $m$ and $\hat{x}_0$ both increased, then $\varepsilon_0(\hat{x}_0)$ from (5.3) would increase but $\hat\varepsilon(\hat{x}_0)$ from (5.4) would decrease, destroying the required identity $\varepsilon_0(\hat{x}_0(m)) \equiv \hat\varepsilon(\hat{x}_0(m))$.

The error $\varepsilon$ of a $1 \to m$ partition from (5.1) varies inversely with $2^n$. The error $\hat\varepsilon$ of a $0 \to m$ partition from (5.4) falls more slowly with increases in the communication budget because the numerator increases with $m$. (This results from the increased length of the interval over which the transmitted bits must distinguish.) We see this in Figure 6, which shows the X-instigator monolog error, with and without an immediate action interval, as a function of the communication budget for $\varphi(x, y) = xy$ on $[0.1, 1.1] \times [0, 1]$. (The error for a nonempty immediate action interval is bounded below by $\varepsilon(\{0.1\} \times E_Y) = 0.05$.)

We have so far studied algorithms which are efficient within the class of X-instigator monologs. Now we ask the question: given a restriction to monolog, who should instigate? Return to the above example of Figure 6. The reader can easily make the parameter interchanges in (B.12), (B.14), (B.15), and (B.16) required to find the solution for a Y-instigator monolog. With one bit ($m = 2$) an efficient Y-instigator algorithm would result in an immediate action interval of $[0, 11/31]$ with an error of $bd\,\Delta x/2(2\Delta x + b) = 11/62$. The best X-instigator monolog would have an error of $11/60$. Therefore Y is the preferred instigator and the Y-instigator algorithm is efficient within the class of monologs. Now consider four bits ($m = 16$). The best Y-instigator monolog has an immediate action interval of $[0, 11/171]$, yielding an error of $11/342 = 176/5472$. However, with an empty immediate action interval, an X-instigator monolog results in a smaller error of $1/32 = 171/5472$. So we see that the identity of the optimal instigator can depend on the size of the communication budget.
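The instigator comparison above can be reproduced with a few lines of exact arithmetic. This is a sketch that assumes the closed forms (B.12), (B.14), (B.15), and (B.16) for $\varphi(x, y) = xy$ and obtains the Y-instigator case by interchanging the roles of the two coordinates; the function names are hypothetical, and ties are broken in favor of X arbitrarily.

```python
from fractions import Fraction


def monolog_error_xy(lo, hi, other_lo, other_hi, m):
    """Efficient monolog error for phi = x*y when the instigator's coordinate
    lies in [lo, hi] and the partner's coordinate lies in [other_lo, other_hi]."""
    span, other_span = hi - lo, other_hi - other_lo
    if other_hi * span / m > lo * other_span:               # (B.14)
        cut = hi * other_hi / (m * other_span + other_hi)   # (B.15)
        return cut * other_span / 2                         # (B.16)
    return other_hi * span / (2 * m)                        # (B.12)


def preferred_instigator(a, b, c, d, n_bits):
    """Compare X- and Y-instigator monologs with a budget of n_bits bits."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    m = 2 ** n_bits
    err_x = monolog_error_xy(a, b, c, d, m)   # X instigates
    err_y = monolog_error_xy(c, d, a, b, m)   # Y instigates: roles interchanged
    who = "X" if err_x <= err_y else "Y"      # tie-break toward X is arbitrary
    return who, err_x, err_y


for n in (1, 4):
    print(n, preferred_instigator("0.1", "1.1", 0, 1, n))
```

Running the loop over other budgets shows how the preferred instigator varies with $n$ for this domain.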
Figure 6: For low $n$ it is efficient to have an immediate action interval.

4.  Dialog  algorithms 

In the previous section we restricted attention to monolog algorithms: those in which any bits sent were transmitted by the instigator. Monologs do not exhaust the algorithmic possibilities, of course. A dialog is an algorithm which is not a monolog: for some $\theta$ the algorithm requires each partner to send at least one bit. We presented results concerning monologs which are efficient within the class of monologs. We now concern ourselves more generally with the problem of finding algorithms which are efficient within the larger class of all algorithms. We will see that an algorithm which is efficient within the class of monologs need not be efficient within this larger class. (For the remainder of the paper we use "efficient" in this stronger sense.)


Dialog  can  outperform  monolog 


Our first example of dialog superiority is extreme, both because it shows a dialog which is infinitely superior to the best monolog and because its optimal action function is somewhat pathological. The second example is more reasonable; its optimal action function is well-behaved and dialog shows only a modest improvement in error relative to monolog.

An extreme example

We  display  an  example  in  which  a  dialog  outperforms  every  monolog  which  has  the  same 
communication  length.  For  any  communication  length  greater  than  unity  the  dialog  error  for  this  optimal 
action  function  will  be  zero  and  the  monolog  error  will  be  positive. 


Consider the optimal action function

$$\varphi(x, y) = \begin{cases} y, & x = 0, \\ 1 - x, & y = 1, \\ 0, & \text{otherwise}, \end{cases} \tag{C.1}$$

which is graphed in Figure 7. Now consider the following X-instigator dialog algorithm: X sends the
message "0" if $x = 0$ and sends "1" otherwise. If "0" was sent, Y can immediately choose $\psi = y$ with
zero error because he knows X's location precisely. If "1" was sent and $y < 1$, then Y can immediately
choose $\psi = 0$, again with zero error. In the worst case "1" was sent and $y = 1$. In this case Y sends
any message at all, after which X chooses $\psi = 1 - x$, again with zero error. (The content of Y's
message is irrelevant; the fact that Y speaks rather than acts tells X that $y = 1$.) So we see that with
only two bits a dialog can pin down the value of $\varphi(x, y)$ with zero error even in the worst case.
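
The dialog just described can be traced in a few lines of code. The following is only an illustrative sketch: the function `phi` encodes the optimal action function as we read it from the description of the dialog (Y's action is y when x = 0, the action is 1 - x when y = 1, and 0 otherwise), and the names `phi`, `dialog`, and the dyadic test grid are our own hypothetical choices.

```python
from itertools import product


def phi(x, y):
    """Optimal action function of the extreme example, as inferred from the text."""
    if x == 0:
        return y
    if y == 1:
        return 1 - x
    return 0.0


def dialog(x, y):
    """Return (action, bits_sent) for the X-instigator dialog described above."""
    if x == 0:
        return y, 1      # X sends "0"; Y knows x exactly and acts
    if y < 1:
        return 0.0, 1    # X sends "1"; Y acts immediately
    return 1 - x, 2      # X sends "1"; Y replies (so y = 1); X acts


# Dyadic grid on the unit square, so the arithmetic below is exact in floating point.
grid = [i / 8 for i in range(9)]
points = list(product(grid, grid))
worst_error = max(abs(dialog(x, y)[0] - phi(x, y)) for x, y in points)
max_bits = max(dialog(x, y)[1] for x, y in points)
print(worst_error, max_bits)
```

The check only confirms that the protocol reproduces the action function on the sampled points with at most two bits; it does not replace the argument in the text.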

Now we claim that there is no finite communication length monolog which will determine the value
of $\varphi$ with zero error in the worst case. Consider any X-instigator monolog. If X immediately acts,
either $y < 1$ or $y = 1$ will yield positive error. If X sends a message, choose as a test case $y = 1$.
No matter how many bits X sends to Y we can choose an $x > 0$ such that Y's estimate of $\varphi(x, y)$
will have positive error, because for $y = 1$ the precise determination of $\varphi(x, y)$ requires that
the value of $x$ be known precisely. Thanks to the symmetry of the optimal action function we know that
this conclusion would hold if Y instigated instead.


Figure 7: Dialog can be infinitely better than monolog.

A more reasonable example

We return to the optimal action function $\varphi(x, y) = xy$ of Example 3 and choose the domain $E$ to
be the unit square $[0, 1] \times [0, 1]$. We exhibit a two-bit (i.e. $m = 4$), X-instigator dialog which
outperforms an efficient, two-bit, X-instigator monolog. As the monolog performance benchmark, we
calculate, using (B.15) through (B.17), that this monolog error is $1/10$ with an immediate action region
over $[0, 1/5)$ and four equally spaced message cells.
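
The benchmark figures can be reproduced with a short equal-error calculation. This is a sketch under stated assumptions: it does not use the formulas (B.15) through (B.17), which are not reproduced in this section, but instead assumes the structure reported in the text (one immediate action interval followed by m equal-width message cells) and equates the two worst-case errors. The function name is hypothetical.

```python
from fractions import Fraction


def monolog_benchmark_xy(m):
    """Equal-error X-instigator monolog for phi = x*y on the unit square.

    Assumed structure: X acts immediately on [0, a0) and otherwise reports
    which of m equal-width cells of [a0, 1] contains x.
    """
    # X acting at x without talking: phi ranges over [0, x], so the error is x/2,
    # with worst case a0/2.
    # Y acting after hearing a cell of width w: the error is y*w/2, worst case w/2 at y = 1.
    # Equal errors give a0 = w, and the cells must fill [0, 1]: a0 + m*w = 1.
    w = Fraction(1, m + 1)
    a0 = w
    boundaries = [a0 + k * w for k in range(m + 1)]
    error = w / 2
    return a0, boundaries, error


a0, boundaries, error = monolog_benchmark_xy(4)
print(a0, [str(b) for b in boundaries], error)
```

For m = 4 this gives an immediate action region of width 1/5 and an error of 1/10, in agreement with the values quoted above.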

In the superior dialog, X immediately acts when $x \in \sigma_0^X = [0, 0.18468)$. (See Figure 8.)
Otherwise X sends the first bit, transmitting "0" when $x \in \sigma_1^X = [0.18468, 0.63054)$ and
transmitting "1" when $x \in \sigma_2^X = [0.63054, 1]$. Y's response depends on X's message. If Y
receives "0", he takes control of the next bit. Otherwise, he listens for a second bit to come from X.

Figure  8:  A  dialog  which  outperforms  a  best  monolog. 

When Y takes control, he immediately acts if $y \in \sigma_0^Y = [0, 0.41421)$. Otherwise he sends "0"
as the second bit when $y \in \sigma_1^Y = [0.41421, 0.70711)$ and "1" when
$y \in \sigma_2^Y = [0.70711, 1]$. This then turns the action decision back over to X. Note that when Y
receives "0" from X, Y faces a one-bit, Y-instigator, monolog problem on the rectangle
$\sigma_1^X \times E_Y$ and solves it the same way we solved the $n$-bit, X-instigator, monolog problem
over an arbitrary rectangle in the previous section.

When $x \in \sigma_2^X$ and after X has sent the first bit, she then faces a one-bit, X-instigator
monolog problem over the rectangle $\sigma_2^X \times E_Y$. The solution to this problem dictates that
she use the second bit to further refine Y's knowledge about $x$ into either of the intervals
$\sigma_{2A}^X = [0.63054, 0.81527)$ or $\sigma_{2B}^X = [0.81527, 1]$. Y then takes an action.

The described two-way, two-bit algorithm results in equal-error cells.¹ (The double-headed arrows in
Figure 8 indicate the relevant actor's worst-case information set corresponding to each rectangle.) This
common error is easily verified to be $0.09234 < 1/10$. For example, when X sends "0" and Y responds
with "1", X's information set is the line segment $\{x\} \times \sigma_2^Y$, where $x \in \sigma_1^X$.
In the worst case, $x = \sup \sigma_1^X = 0.63054$, and X's best guess, from (A.6), is
$\psi = \tfrac{1}{2}(0.63054)(0.70711 + 1) = 0.53820$, which results, from (A.7), in the error
$\tfrac{1}{2}(0.63054)(1 - 0.70711) = 0.09234$.
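
The equal-error claim can be checked cell by cell. The sketch below assumes, as in the worked case above, that the error of an action for $\varphi = xy$ is half the oscillation of $\varphi$ over the actor's worst-case information set; the helper names and the dictionary of cases are our own, and the boundaries are the rounded values quoted in the text.

```python
def error_when_x_acts(x_sup, y_lo, y_hi):
    """X knows x and that y lies in [y_lo, y_hi]; the worst case is the largest x."""
    return 0.5 * x_sup * (y_hi - y_lo)


def error_when_y_acts(y_sup, x_lo, x_hi):
    """Y knows y and that x lies in [x_lo, x_hi]; the worst case is the largest y."""
    return 0.5 * y_sup * (x_hi - x_lo)


# Cell boundaries quoted in the text (rounded to five decimals).
bx = [0.0, 0.18468, 0.63054, 0.81527, 1.0]   # sigma_0^X, sigma_1^X, sigma_2A^X, sigma_2B^X
by = [0.0, 0.41421, 0.70711, 1.0]            # sigma_0^Y, sigma_1^Y, sigma_2^Y

errors = {
    "X acts immediately": error_when_x_acts(bx[1], 0.0, 1.0),
    "X sends 0, Y acts immediately": error_when_y_acts(by[1], bx[1], bx[2]),
    "X sends 0, Y sends 0, X acts": error_when_x_acts(bx[2], by[1], by[2]),
    "X sends 0, Y sends 1, X acts": error_when_x_acts(bx[2], by[2], by[3]),
    "X sends 1 then 0, Y acts": error_when_y_acts(1.0, bx[2], bx[3]),
    "X sends 1 then 1, Y acts": error_when_y_acts(1.0, bx[3], bx[4]),
}

for case, err in errors.items():
    print(f"{case}: {err:.5f}")
```

Up to the rounding of the quoted boundaries, every case evaluates to roughly 0.0923, below the monolog benchmark of 1/10.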


¹ We are not at this point asserting that the exhibited dialog is efficient; that is not necessary for our
claim that it is superior to any efficient monolog.

Monolog is efficient with additive separability

We have seen above that monolog need not be sufficient for efficiency. We now show that whenever
the optimal action function is additively separable, monolog is efficient: no dialog can do better. By
additively separable we mean that the optimal action function is of the form
$\varphi(x, y) = f(x) + g(y)$, where $f : E_X \to \mathbb{R}$ and $g : E_Y \to \mathbb{R}$ are continuous.
In order to prove this result we can use the upcoming Theorem 8 to justify restricting attention to
optimal action functions of the simpler form

$$\varphi(x, y) = x + y, \tag{C.2}$$

on the closed rectangular region $E = E_X \times E_Y \subset \mathbb{R}^2$.

First we make an observation about this additively separable optimal action function. Let
$\sigma_X \subset E_X$ and $\sigma_Y \subset E_Y$. For this $\varphi$ and for all generalized rectangles
$\sigma_X \times \sigma_Y$ the error of an immediate action by X is independent of $x$ and the error of
an immediate action by Y is independent of $y$. To see this we note that

$$\varphi(\{x\} \times \sigma_Y) = \{\varphi(x, y) : y \in \sigma_Y\} = \{x + y : y \in \sigma_Y\}, \tag{C.3}$$

and therefore

$$\varepsilon(\{x\} \times \sigma_Y) = \tfrac{1}{2}\left(\sup\{x + y : y \in \sigma_Y\} - \inf\{x + y : y \in \sigma_Y\}\right) = \tfrac{1}{2}(\sup \sigma_Y - \inf \sigma_Y), \tag{C.4}$$

and similarly for immediate action by Y. As a consequence, every $x$ is a worst-case $x$ and every $y$
is a worst-case $y$.
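
A small numerical illustration of this observation follows. It is a sketch only: the set $\sigma_Y$ is replaced by a hypothetical finite, deliberately nonconvex sample (so that the supremum and infimum are attained), and the multiplicative function is included purely as a contrast.

```python
def immediate_error(phi, x, sigma_y):
    """Half the oscillation of phi(x, .) over a finite sample of sigma_Y."""
    values = [phi(x, y) for y in sigma_y]
    return 0.5 * (max(values) - min(values))


sigma_y = [0.25, 0.5, 1.0, 1.5]   # hypothetical nonconvex sample of sigma_Y
xs = [0.0, 1.0, 2.5]

additive = [immediate_error(lambda x, y: x + y, x, sigma_y) for x in xs]
multiplicative = [immediate_error(lambda x, y: x * y, x, sigma_y) for x in xs]

print(additive)        # the same value for every x
print(multiplicative)  # grows with x
```

For $x + y$ the error is the same at every sampled $x$, whereas for $xy$ it scales with $x$, which is why a worst-case $x$ matters in the multiplicative examples but not here.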

Lemma 6A says that it is never efficient to have a nonempty immediate action region. Lemma 6B
says that it is never efficient to turn control of any remaining bits over to the other partner, if that
partner will use those bits in a monolog. The proof of Theorem 6 then uses an induction argument to
establish that it is never efficient for one partner to ever turn control of any remaining bits over to
the other partner, whether or not that partner will engage in a monolog from that point forward.
Therefore in any efficient algorithm all the bits are sent by the instigator, i.e. only a monolog can be
efficient for an additively separable optimal action function.

Lemma 6A

Let $\varphi(x, y) = x + y$ and let $n \ge 1$. Any $n$-bit, X-instigator algorithm (monolog or dialog) on
a generalized rectangle $\sigma_X \times \sigma_Y \subset E$ which has a nonempty immediate action region
for the instigator's initial action will be outperformed by some $n$-bit, Y-instigator monolog whose
immediate action region is empty.

Proof

Consider an X-instigator, $n$-bit algorithm where X's control of the first bit generates a partition
$\sigma_0 \cup \sigma_1 \cup \sigma_2 = \sigma_X$ where $\sigma_0 \ne \emptyset$; i.e. which does have an
immediate action region. The error $\varepsilon^X$ of this algorithm is at least as large as the error of
immediate action $\varepsilon_0^X = \tfrac{1}{2}(\sup \sigma_Y - \inf \sigma_Y)$. This algorithm is
dominated by an equal-error Y-instigator one-way algorithm which results in an error no larger than²
$(\sup \sigma_Y - \inf \sigma_Y)/2m < \varepsilon_0^X \le \varepsilon^X$, where $m = 2^n$ is the number
of messages. (In Figure 9a the two-headed arrow corresponds to the oscillation of the optimal action
function over X's information set under the immediate action of the original algorithm. In Figure 9b the
shorter two-headed arrows correspond to the oscillation of the optimal action function over X's
information set when Y sends a message.) ∎

² We do not assume that $\sigma_Y$ is convex; at worst the one-way error is that which would be achieved
over the convex hull of $\sigma_Y$.
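
The inequality used in this proof is easy to tabulate. The sketch below assumes that an $n$-bit monolog distinguishes $m = 2^n$ messages and that Y splits the convex hull of $\sigma_Y$ into $m$ equal cells; the function names and the unit interval used as $\sigma_Y$ are hypothetical.

```python
def immediate_action_error(y_inf, y_sup):
    """Error when X acts without any communication, for phi = x + y."""
    return 0.5 * (y_sup - y_inf)


def y_monolog_error_bound(y_inf, y_sup, n_bits):
    """Error bound for a Y-instigator monolog with m = 2**n_bits equal cells."""
    m = 2 ** n_bits
    return (y_sup - y_inf) / (2 * m)


baseline = immediate_action_error(0.0, 1.0)
rows = [(n, y_monolog_error_bound(0.0, 1.0, n)) for n in range(1, 5)]
for n, bound in rows:
    print(n, bound, bound < baseline)
```

Even a single bit halves the error of the immediate action, which is the sense in which a nonempty immediate action region is wasteful in the additive case.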


Figure 9: (a) An X-instigator n-bit algorithm with a nonempty immediate action region can be replaced
by (b), a lower error Y-instigator, n-bit monolog algorithm which has no immediate action region.


Lemma 6B

Let $\varphi(x, y) = x + y$ on $E$ and let $n \ge 2$. Consider any $n$-bit, X-instigator algorithm on
$\sigma_X \times \sigma_Y \subset E$ which for some $x \in \sigma_X$ turns control of the second bit over
to Y, who then implements an $(n-1)$-bit monolog algorithm. Then this $n$-bit, X-instigator algorithm is
dominated by an $n$-bit, Y-instigator, monolog algorithm.


Proof

From Lemma 6A we know that without loss of generality we can assume that X's control of the first bit
generates a partition $\sigma_1 \cup \sigma_2 = \sigma_X$. Assume that, in response to some message
$i \in \{1, 2\}$ from X, Y does take control of the second bit and implements an $(n-1)$-bit monolog on
$\sigma_i \times \sigma_Y$ specified by a partition $\{\sigma_{ij}^Y\}_{j=1,\dots,k}$ of $\sigma_Y$,
where $k = 2^{n-1}$. [Lemma 6A assures us that this $(n-1)$-bit, Y-instigator algorithm will not have an
immediate action region and therefore $\sigma_{i0}^Y = \emptyset$.] For $j \in \{1, \dots, k\}$, this
Y-instigator algorithm results in the cell errors

$$\varepsilon_{ij} = \tfrac{1}{2}(\sup \sigma_{ij}^Y - \inf \sigma_{ij}^Y), \tag{6B.1}$$

and therefore the error $\varepsilon^X$ of the original $n$-bit, X-instigator dialog is bounded below by
$\max_j \{\varepsilon_{ij}\}$.

Now define an $(n-1)$-bit, Y-instigator monolog $\hat{\sigma}$ by a partition
$\{\hat{\sigma}_j^Y\}_{j=1,\dots,k}$, where $\hat{\sigma}_j^Y = \sigma_{ij}^Y$ for
$j \in \{1, \dots, k\}$. From (6B.1), $\hat{\varepsilon}_j = \varepsilon_{ij}$ for
$j \in \{1, \dots, k\}$, so that

$$\hat{\varepsilon} = \max_j \{\hat{\varepsilon}_j\} = \max_j \{\varepsilon_{ij}\} \le \varepsilon^X. \tag{6B.2}$$

This $(n-1)$-bit monolog weakly outperforms the original $n$-bit dialog; therefore there exists an
$n$-bit monolog which strictly outperforms the dialog. ∎

Theorem 6

The minimum achievable error for $n$ bits for $\varphi(x, y) = f(x) + g(y)$ on $E$, where $f$ and $g$
are continuous, can be achieved by a monolog.


Proof

We first prove the proposition for the special case of $\varphi(x, y) = x + y$ and then generalize via
Theorem 8. Lemma 6B tells us for the case $n = 2$ that without loss of generality we can assume that any
two-bit algorithm on any generalized rectangle is a monolog. (If a two-bit, X-instigator algorithm were a
dialog, then, for some $x$, Y would send the second bit. However, we know that such an algorithm would
not be efficient because it is dominated by some two-bit, Y-instigator monolog.) Now consider $n = 3$ and
let X be the instigator of an efficient algorithm. From Lemma 6A, X's control of the first bit generates a
partition $\sigma_1 \cup \sigma_2 = \sigma_X$. We know from our just previous conclusion for $n = 2$ that
we can assume that the two-bit algorithm spun off by each $\sigma_i$, $i \in \{1, 2\}$, is a monolog. But
then Lemma 6B tells us that these two two-bit algorithms must also be X-instigator algorithms; otherwise
our algorithm would not be efficient. Therefore we can without loss of generality assume that any
efficient three-bit algorithm on any generalized rectangle is a monolog. By induction on $n$ we prove our
desired result. From Theorem 8 we know that the communication problem for
$\varphi(x, y) = f(x) + g(y)$ on $E$ is equivalent to the problem $\varphi(x, y) = x + y$ on
$\hat{E} = f(E_X) \times g(E_Y)$. Therefore the efficient algorithm for the general additively separable
optimal action function is a monolog. ∎

Monolog  can  be  efficient  without  additive  separability 

We have seen an example, viz. $\varphi(x, y) = xy$ on $[0, 1] \times [0, 1]$, in which monolog was
insufficient for efficiency. Theorem 6 tells us that monolog is always sufficient for efficiency for
additively separable optimal action functions. This raises the question of whether additive separability
is not only sufficient but actually necessary for monolog efficiency. We answer this question in the
negative by asserting that a best monolog is efficient for $\varphi(x, y) = xy$ when we move the domain
to $[1, 2] \times [1, 2]$.

This demonstration requires more development than required for the earlier example of monolog
insufficiency. In that case it was sufficient to exhibit a dialog which outperformed a best monolog; we
did not need to claim that the exhibited dialog was efficient. Now, however, in order to show that there
exists a monolog which is efficient on this new domain we must prove the nonexistence of a better
dialog and therefore we must find a best dialog. Theorem 7 tells us that, for two bits and when the
optimal action function is multiplicatively separable, we can restrict our consideration to an
extremely small subset of the great variety of conceivable algorithms. This makes the construction of an
algorithm to find a best dialog practical. Running this algorithm for $\varphi(x, y) = xy$ on
$[1, 2] \times [1, 2]$ resulted in the finding that every candidate for an efficient dialog was dominated
by a monolog.

Before stating the theorem we will develop some notation for discussing two-bit, X-instigator
algorithms. X's control of the first bit generates a partition
$\sigma_0 \cup \sigma_1 \cup \sigma_2 = E_X = [a, b]$. If the algorithm specifies that when
$x \in \sigma_i$, $i \in \{1, 2\}$, X maintains control of the second bit as well, then the partition
$\sigma_{i0}^X \cup \sigma_{i1}^X \cup \sigma_{i2}^X = \sigma_i$ is generated; i.e. message $i$ leads to a
one-bit, X-instigator algorithm on $\sigma_i \times E_Y$. If instead the algorithm specifies that when
$x \in \sigma_i$, $i \in \{1, 2\}$, Y takes control of the second bit, then the partition
$\sigma_{i0}^Y \cup \sigma_{i1}^Y \cup \sigma_{i2}^Y = E_Y = [c, d]$ is generated; i.e. message $i$ leads
to a one-bit, Y-instigator algorithm on $\sigma_i \times E_Y$.


Theorem 7

The minimum error achievable by an X-instigator algorithm for the optimal action function
$\varphi(x, y) = xy$ on $E \subset \mathbb{R}_+^2$ can be achieved by an algorithm of the following form:

(1) The cells $\sigma_0$, $\sigma_1$, and $\sigma_2$ are convex; $\sigma_0 \ni a$; and
$\inf \sigma_1 < \inf \sigma_2$.

(2) If $\sigma_i$, $i \in \{1, 2\}$, leads to a one-bit, X-instigator algorithm, then its immediate action
region is empty, i.e. $\sigma_{i0}^X = \emptyset$.

(3) If the one-bit instigator for $\sigma_1$ is different than for $\sigma_2$, then Y instigates for
$\sigma_1$ and X instigates for $\sigma_2$.


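Proof

The convexity of $\sigma_0$ and the emptiness of $\sigma_{i0}^X$ both follow from the monotonicity of
$\varepsilon(\{x\} \times E_Y)$. (If $x \in \sigma_{i0}^X$, then we could without loss of generality
incorporate the interval $[a, x]$ into $\sigma_0$.) Consider $i \in \{1, 2\}$. If $\sigma_i$ leads to Y
instigating with the remaining bit, then the error over $\sigma_i$ is

$$\varepsilon_i^Y = \max\{\varepsilon_{i0}^Y, \varepsilon_{i1}^Y, \varepsilon_{i2}^Y\},$$

where

$$\varepsilon_{i0}^Y = \tfrac{1}{2}\,\Delta\sigma_i \,\sup \sigma_{i0}^Y,$$

$$\varepsilon_{ij}^Y = \tfrac{1}{2}\,\Delta\sigma_{ij}^Y \,\sup \sigma_i,$$

for $j \in \{1, 2\}$, where $\Delta\sigma \equiv \sup \sigma - \inf \sigma$. This error would be unchanged
by the convexification of $\sigma_i$. This error would be decreased by shifting $\sigma_i$ to the left,
because this would leave $\Delta\sigma_i$ unchanged and would decrease $\sup \sigma_i$.

If $\sigma_i$ leads to X instigating again, the error over $\sigma_i$ is
$\varepsilon_i^X = \max\{\varepsilon_{i1}^X, \varepsilon_{i2}^X\}$, where
$\varepsilon_{ij}^X = \tfrac{1}{2}\, d\, \Delta\sigma_{ij}^X$, $j \in \{1, 2\}$. Each $\varepsilon_{ij}^X$
would be unchanged if we convexify and/or translate, either to the right or left, the cell
$\sigma_{ij}^X$. If necessary we can convexify $\sigma_i$ by translating $\sigma_{i1}^X$ and
$\sigma_{i2}^X$ to the right in order to make them adjacent and translating $\sigma_{-i}$ (the other
first-bit cell) to the left, which weakly decreases the error of this cell. This process also
establishes (3). ∎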
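
The two error expressions in this proof can be written as small functions, which makes the "shift to the left" step easy to try out. This is a sketch for $\varphi = xy$ on nonnegative intervals only; the function names are hypothetical, and the example cells are the rounded values from the unit-square dialog discussed earlier rather than anything computed for $[1, 2] \times [1, 2]$.

```python
def error_y_instigates(sigma_i, y_cuts):
    """Worst-case error over sigma_i when Y controls the remaining bit.

    sigma_i = (inf, sup) of X's first-bit cell.
    y_cuts  = (c, y1, y2, d): sigma_i0^Y = [c, y1), sigma_i1^Y = [y1, y2), sigma_i2^Y = [y2, d].
    """
    lo, hi = sigma_i
    c, y1, y2, d = y_cuts
    # Y acts immediately: error (1/2) * width(sigma_i) * sup(sigma_i0^Y); zero if that cell is empty.
    e0 = 0.5 * (hi - lo) * y1 if y1 > c else 0.0
    # Y sends message j, then X acts: error (1/2) * width(sigma_ij^Y) * sup(sigma_i).
    e1 = 0.5 * (y2 - y1) * hi
    e2 = 0.5 * (d - y2) * hi
    return max(e0, e1, e2)


def error_x_instigates(x_cuts, d):
    """Worst-case error when X keeps the remaining bit: (1/2) * d * width(sigma_ij^X)."""
    x1, x2, x3 = x_cuts   # sigma_i1^X = [x1, x2), sigma_i2^X = [x2, x3]
    return max(0.5 * d * (x2 - x1), 0.5 * d * (x3 - x2))


y_cuts = (0.0, 0.41421, 0.70711, 1.0)
original = error_y_instigates((0.18468, 0.63054), y_cuts)
shifted = error_y_instigates((0.08468, 0.53054), y_cuts)   # same width, moved left by 0.1
kept_by_x = error_x_instigates((0.63054, 0.81527, 1.0), 1.0)
print(original, shifted, kept_by_x)
```

Shifting the Y-instigated cell to the left leaves its width unchanged and lowers its supremum, so the maximum of the three terms cannot rise, which is the monotonicity the proof relies on.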

5.  Separability  and  Monotonicity 

We have thus far imposed monotonicity requirements upon the optimal action function $\varphi$ and the
errors $\varepsilon(\{x\} \times \sigma_Y)$ and $\varepsilon(\sigma_X \times \{y\})$ on the domain $E$.
Theorem 8 allows us to relax these assumptions in many cases. Denote the $n$-bit efficient communication
problem for the optimal action function $\varphi$ over $E$ by $(\varphi, E)$. If $\varphi$ does not
satisfy the monotonicity assumptions on $E$, then we seek to transform $(\varphi, E)$ into an equivalent
problem $(\hat{\varphi}, \hat{E})$ where $\hat{\varphi}$ satisfies the assumptions on $\hat{E}$. The
transformed problem would then be amenable to analysis using the methods we have developed thus far.

Consider an optimal action function $\varphi$, represented as separable in $f(x)$ and $g(y)$, i.e. such
that for all $(x, y) \in E$,

$$\varphi(x, y) = \hat{\varphi}(f(x), g(y)) \tag{D.1}$$

for some function $\hat{\varphi}$. Define $\hat{E}_X = f(E_X)$, $\hat{E}_Y = g(E_Y)$, and
$\hat{E} = \hat{E}_X \times \hat{E}_Y$. Then $f : E_X \to \hat{E}_X$, $g : E_Y \to \hat{E}_Y$, and
$\hat{\varphi} : \hat{E} \to \mathbb{R}$. The separation in (D.1) can be performed trivially for any
$\varphi$ (e.g. let $\hat{\varphi} = \varphi$, $f(x) = x$, and $g(y) = y$); the challenge is to find a
nontrivial decomposition which allows the desired transformation of the original communication problem.
We require that $f$ and $g$ be continuous so that $\hat{E}_X$ and $\hat{E}_Y$ are intervals and thus
$\hat{E}$ is a rectangle.

When we have decomposed the optimal action function à la (D.1) we can conceive of the communication
problem facing X and Y in either of two ways: (1) to communicate their private information
$x \in E_X$ and $y \in E_Y$ in the manner we have studied thus far and then to approximate
$\varphi(x, y)$, or (2) to communicate $u = f(x) \in \hat{E}_X$ and $v = g(y) \in \hat{E}_Y$ and then to
approximate $\hat{\varphi}(u, v)$. Because the optimal action function is sensitive only to the images of
$x$ and $y$ under $f$ and $g$, respectively, these two communication problems are equivalent. Theorem 8
justifies this intuitive argument that the error of the communication problem $(\varphi, E)$ is equal to
the error of the problem $(\hat{\varphi}, \hat{E})$.


Theorem 8

Consider a separable representation of $\varphi$ as in (D.1). Then the error of any efficient algorithm
$R$ for the communication problem $(\varphi, E)$ is equal to the error of any efficient algorithm
$\hat{R}$ for the communication problem $(\hat{\varphi}, \hat{E})$.


Proof

An algorithm defines for each partner an operation (which may be null, an action, or the transmission of
a message) to be performed at each time $t$ as a function of that partner's coordinate and the history
$h_t$ of communication up until that time. Specifically, it defines a sequence of functions
$\{\alpha_t(h_t, x)\}_{t=0,\dots,m}$ for X and $\{\beta_t(h_t, y)\}_{t=0,\dots,m}$ for Y such that at any
time $t$, and given the history $h_t$, X performs the operation $a_t = \alpha_t(h_t, x)$ and Y performs
the operation $b_t = \beta_t(h_t, y)$.

We construct the sequence $\{\hat{\alpha}_t(h_t, u)\}_{t=0,\dots,m}$ of operation functions for X for the
algorithm $\hat{R}$ from the original sequence for $R$ in the following way. For each
$u \in \hat{E}_X$, arbitrarily choose some element of its inverse image,³

$$\mu(u) = \operatorname{typ} f^{-1}(u). \tag{8.2}$$

For each $t$, each $u \in \hat{E}_X$, and each $h_t$, set

$$\hat{\alpha}_t(h_t, u) = \alpha_t(h_t, \mu(u)). \tag{8.3}$$

³ The function name typ might be interpreted as "typical" or "t[ake] y[our] p[ick]." We use it because
other techniques, e.g. taking the infimum or the minimum, can fail. (The infimum may not belong to the
set; the minimum may not exist.) It is, of course, crucial that typ truly be a function, i.e. uniquely
defined. The only other important property of typ $S$, for $S \ne \emptyset$, is that
$\operatorname{typ} S \in S$.

Similarly, for each $v \in \hat{E}_Y$, let

$$\gamma(v) = \operatorname{typ} g^{-1}(v). \tag{8.4}$$

For each $t$, each $v \in \hat{E}_Y$, and each $h_t$, set

$$\hat{\beta}_t(h_t, v) = \beta_t(h_t, \gamma(v)). \tag{8.5}$$

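Denote the action resulting from $R$ for $(x, y) \in E$ by $\psi(x, y)$, which results in an error
$\varepsilon(x, y)$. Denote the action resulting from $\hat{R}$ for $(u, v) \in \hat{E}$ by
$\hat{\psi}(u, v)$, which results in the error $\hat{\varepsilon}(u, v)$. The sets of achieved errors
under $R$ and $\hat{R}$, respectively, are

$$\mathcal{E} = \{\varepsilon(x, y) : (x, y) \in E\}, \tag{8.6}$$

$$\hat{\mathcal{E}} = \{\hat{\varepsilon}(u, v) : (u, v) \in \hat{E}\}. \tag{8.7}$$

From (8.3) and (8.5) we see that X and Y will behave at $(u, v) \in \hat{E}$ under algorithm $\hat{R}$
exactly as they would have at $(\mu(u), \gamma(v)) \in E$ under algorithm $R$. In particular,

$$\hat{\psi}(u, v) = \psi(\mu(u), \gamma(v)), \tag{8.8}$$

and therefore

$$\hat{\varepsilon}(u, v) = \varepsilon(\mu(u), \gamma(v)). \tag{8.9}$$

Because $\mu(u) \in E_X$ and $\gamma(v) \in E_Y$, we see that for all $(u, v) \in \hat{E}$,
$\hat{\varepsilon}(u, v) \in \mathcal{E}$, therefore $\hat{\mathcal{E}} \subset \mathcal{E}$, and
therefore $\sup \hat{\mathcal{E}} \le \sup \mathcal{E}$. Therefore the error of $\hat{R}$ is weakly less
than the error of $R$. However, the error of $\hat{R}$ cannot be strictly less than that of $R$ because
$R$ is efficient. (Otherwise the algorithm $R'$ on $E$ defined by

$$\alpha'_t(h_t, x) = \hat{\alpha}_t(h_t, f(x)), \tag{8.10}$$

$$\beta'_t(h_t, y) = \hat{\beta}_t(h_t, g(y)), \tag{8.11}$$

would then have an error $\varepsilon' = \hat{\varepsilon} < \varepsilon$.) Therefore the error of
$\hat{R}$ is equal to the error of $R$. ∎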


In Figure 10 we graph the optimal action function $\varphi(x, y) = \sin 4\pi x + \sin 4\pi y$ on
$[0, 1] \times [0, 1]$, which obviously violates our monotonicity assumptions. Letting
$f(x) = \sin 4\pi x$, $g(y) = \sin 4\pi y$, and $\hat{\varphi}(u, v) = u + v$, we can then write
$\varphi$ in the separable form (D.1). Therefore we need to solve the problem of
$\hat{\varphi}(u, v) = u + v$ on $\hat{E} = f([0, 1]) \times g([0, 1]) = [-1, 1] \times [-1, 1]$. We know
from Theorem 6 that monolog is sufficient for efficiency and that we should have an empty immediate
action interval. The problem is symmetrical in $x$ and $y$; we use (5.1) to calculate the efficient error
to be $(b - a)/2m = 1/m$ and the cell boundaries to be $u_k = 2k/m - 1$, $k \in M$.
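
These values are simple to tabulate. The sketch below assumes, consistent with the figures quoted above, that the efficient monolog for $u + v$ on an interval $[a, b]$ uses $m$ equal-width cells and no immediate action region; formula (5.1) itself is not reproduced in this section, and the function name is hypothetical.

```python
def additive_monolog_cells(a, b, m):
    """Equal-width cell boundaries on [a, b] and the resulting error (b - a) / (2m)."""
    boundaries = [a + k * (b - a) / m for k in range(m + 1)]
    error = (b - a) / (2 * m)
    return boundaries, error


# Transformed problem on [-1, 1] with a two-bit monolog (m = 4 messages).
boundaries, error = additive_monolog_cells(-1.0, 1.0, 4)
print(boundaries, error)
```

With $a = -1$ and $b = 1$ the boundaries coincide with $u_k = 2k/m - 1$ and the error equals $1/m$.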

The proof of Theorem 8 [see (8.3)] tells us that we find the partition of $E$ for the original problem
from the partition of $\hat{E}$ by finding the inverse image of each cell $\hat{\sigma}_i$, i.e.

$$\sigma_i = f^{-1}(\hat{\sigma}_i) = \frac{1}{4\pi} \sin^{-1}(\hat{\sigma}_i), \tag{D.2}$$

for $i \in M$, where by $\sin^{-1}$ we mean the inverse image and not just the principal value. Figure 11
shows the one-bit case for simplicity. The vertical $u$ axis on the right-hand side shows the convex
partition $\hat{\sigma}$ of $\hat{E}_X$. The two cells of the partition $\sigma$ for the nonmonotonic
optimal action function $\varphi$ each consist of the union of two disjoint intervals, each of which has
the same image under $f$.
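
The inverse image of a cell under $f(x) = \sin 4\pi x$ can be computed branch by branch. The following is a sketch only: the function name, the merge tolerance, and the decision to drop zero-length pieces (isolated points such as $x = 1$, where $\sin 4\pi x = 0$) are our own choices and not part of the original construction.

```python
import math


def preimage_sin4pi(lo, hi):
    """Intervals of x in [0, 1] with lo <= sin(4*pi*x) <= hi, for -1 <= lo < hi <= 1.

    Returned as merged closed intervals; zero-length pieces are dropped.
    """
    a_lo, a_hi = math.asin(lo), math.asin(hi)
    pieces = []
    for j in range(-1, 3):                     # enough periods to cover theta = 4*pi*x in [0, 4*pi]
        base = 2 * math.pi * j
        branches = (
            (base + a_lo, base + a_hi),                        # rising branch of the sine
            (base + math.pi - a_hi, base + math.pi - a_lo),    # falling branch of the sine
        )
        for t0, t1 in branches:
            x0 = max(t0 / (4 * math.pi), 0.0)
            x1 = min(t1 / (4 * math.pi), 1.0)
            if x0 < x1:
                pieces.append((x0, x1))
    pieces.sort()
    merged = []
    for x0, x1 in pieces:
        if merged and x0 <= merged[-1][1] + 1e-12:
            merged[-1] = (merged[-1][0], max(merged[-1][1], x1))
        else:
            merged.append((x0, x1))
    return merged


# One-bit case: the transformed cells are [-1, 0] and [0, 1].
upper = preimage_sin4pi(0.0, 1.0)
lower = preimage_sin4pi(-1.0, 0.0)
print(upper)
print(lower)
```

In the one-bit case each cell pulls back to two disjoint intervals of length 1/4, matching the description of Figure 11.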


Figure 10: A nonmonotonic but separable function can still be studied. (Graph of
$\varphi(x, y) = \sin 4\pi x + \sin 4\pi y$.)

Similarly, we can study with the techniques we have already developed many multiplicatively
separable problems, i.e. where the optimal action function is of the form $\varphi(x, y) = f(x) g(y)$,
even when this $\varphi$ does not itself satisfy the monotonicity assumptions, because
$\hat{\varphi}(u, v) = uv$ does satisfy those requirements on any rectangle $\hat{E}$ as long as neither
$\hat{E}_X$ nor $\hat{E}_Y$ contains zero as an interior point. (Note that the error of immediate action
at $u$ is $\varepsilon(\{u\} \times \hat{\sigma}_Y) = \tfrac{1}{2} |u| \, \Delta\hat{\sigma}_Y$, which is
not monotonic in $u$ over any $\hat{E}_X$ which contains zero in its interior.)

6.  Concluding  Remarks 

We  have  presented  a  model  of  bounded  rationality  resulting  from  costly  communication  in  a  two- 
member  team  context  in  which  the  mechanism  designer  confronted  the  tradeoff  between  increased 
expenditure  of  resources  for  communication  and  increased  accuracy  of  the  solution.  Our  measure  of 
communication  emphasized  that  communication  is  costly  because  it  is  time  consuming. 

We  completely  solved  the  efficient  monolog  problem  for  a  large  class  of  optimal  action  functions 
which  satisfied  either  monotonicity  or  separability  assumptions.  This  solution  included  the 
determination — for  any  given  length  of  communication — of  the  optimal  instigator,  the  cell  boundaries  of 
the  partition,  the  action  decision  rules,  and  the  resulting  error.  Having  computed  the  error  as  a  function 
of  communication  length,  we  have  thus  characterized  the  tradeoff  between  the  accuracy  and  cost  of 
communication. We have shown that there exist optimal action functions such that for some realizations
of the private information there will be no communication, even though the right to communicate has
already been paid for. The explanation is that remaining silent in optimally chosen, predefined situations
sufficiently increases the informativeness of communication in the other cases that this more than
compensates for the error of the no-communication actions. We also saw that the identity of the optimal
instigator could depend on the size of the communication budget.


Figure  1 1 :  The  cell  boundaries  for  the  original  problem  are  found 
from  the  boundaries  of  the  transformed  problem. 

We  have  paid  particular  attention  to  the  question  of  whether  efficiency  requires  that  both  members 
speak  to  each  other  in  a  dialog  rather  than  that  one  member  engage  in  a  monolog  directed  to  the  other. 
We  exhibited  examples  showing  that  the  answer  to  this  question  could  in  general  go  either  way.  If  the 
optimal  action  function  is  additively  separable,  however,  we  showed  that  one  can  restrict  attention  to 
monologs without loss of generality. Additive separability corresponds to an independence of the
partners' private data.

Admittedly,  the  information  structure  in  our  model — a  single  real  number  known  by  each  team 
member — might  seem  too  austere  to  realistically  pose  a  significant  communication  challenge.  Our 
current  work-in-progress  includes  enriching  the  present  framework  by  endowing  each  member  with 
private  knowledge  of  a  point  in  a  multidimensional  space.  Our  research  in  this  direction  has  already  been 
informed  by  both  the  intuitions  and  the  concrete  results  of  the  present  work.  We  have  found  that 
interesting  multidimensional  problems  decompose  into  separate  instances  of  one-dimensional  problems 
of  the  type  we  have  treated  here  and  thus  directly  require  application  of  techniques  we  have  developed  in 
this paper.

We see the present paper as one step in a research program dedicated to explaining features of
organizational structure which are insufficiently understood when the details of the communication
process are ignored.